Can a website detect when you are using selenium with chromedriver?























I've been testing Selenium with Chromedriver, and I've noticed that some pages can detect that you're using Selenium even when there's no automation at all. Even when I'm just browsing manually in Chrome launched through Selenium and Xephyr, I often get a page saying that suspicious activity was detected. I've checked my user agent and my browser fingerprint, and both are identical to those of the normal Chrome browser.



When I browse to these sites in normal chrome everything works fine, but the moment I use Selenium I'm detected.



In theory, chromedriver and Chrome should look exactly the same to any web server, but somehow they can detect it.



If you want some test code, try this:



from pyvirtualdisplay import Display
from selenium import webdriver

# Start a visible virtual display for the browser to render into.
display = Display(visible=1, size=(1600, 902))
display.start()

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--profile-directory=Default')
chrome_options.add_argument('--incognito')
chrome_options.add_argument('--disable-plugins-discovery')
chrome_options.add_argument('--start-maximized')

# Recent Selenium releases take `options=`; older ones used `chrome_options=`.
driver = webdriver.Chrome(options=chrome_options)
driver.delete_all_cookies()
driver.set_window_size(800, 800)
driver.set_window_position(0, 0)
print('arguments done')
driver.get('http://stubhub.com')


If you browse around stubhub you'll get redirected and 'blocked' within one or two requests. I've been investigating this and I can't figure out how they can tell that a user is using Selenium.



How do they do it?



EDIT UPDATE:



I installed the Selenium IDE plugin in Firefox and I got banned when I went to stubhub.com in the normal firefox browser with only the additional plugin.



EDIT:



When I use Fiddler to view the HTTP requests being sent back and forth I've noticed that the 'fake browser's' requests often have 'no-cache' in the response header.



EDIT:



Results like "Is there a way to detect that I'm in a Selenium Webdriver page from Javascript" suggest that there should be no way to detect when you are using a webdriver. But this evidence suggests otherwise.



EDIT:



The site uploads a fingerprint to their servers, but I checked and the fingerprint of selenium is identical to the fingerprint when using chrome.



EDIT:



This is one of the fingerprint payloads that they send to their servers



{"appName":"Netscape","platform":"Linuxx86_64","cookies":1,"syslang":"en-US","userlang":"en-US","cpu":"","productSub":"20030107","setTimeout":1,"setInterval":1,"plugins":{"0":"ChromePDFViewer","1":"ShockwaveFlash","2":"WidevineContentDecryptionModule","3":"NativeClient","4":"ChromePDFViewer"},"mimeTypes":{"0":"application/pdf","1":"ShockwaveFlashapplication/x-shockwave-flash","2":"FutureSplashPlayerapplication/futuresplash","3":"WidevineContentDecryptionModuleapplication/x-ppapi-widevine-cdm","4":"NativeClientExecutableapplication/x-nacl","5":"PortableNativeClientExecutableapplication/x-pnacl","6":"PortableDocumentFormatapplication/x-google-chrome-pdf"},"screen":{"width":1600,"height":900,"colorDepth":24},"fonts":{"0":"monospace","1":"DejaVuSerif","2":"Georgia","3":"DejaVuSans","4":"TrebuchetMS","5":"Verdana","6":"AndaleMono","7":"DejaVuSansMono","8":"LiberationMono","9":"NimbusMonoL","10":"CourierNew","11":"Courier"}}


It's identical in Selenium and in Chrome.
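To back up a claim like "the payloads are identical", the two captured JSON payloads can be compared key by key instead of by eye. The following is a minimal sketch, not part of the original post: the function name `diff_fingerprints` and the shortened sample payloads are hypothetical, and it assumes you have already captured both payloads as JSON text (for example with Fiddler).

```python
import json


def diff_fingerprints(a, b, prefix=""):
    """Return the dotted key paths whose values differ between two payload dicts."""
    differences = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        left, right = a.get(key), b.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            # Recurse into nested sections such as "screen" or "plugins".
            differences.extend(diff_fingerprints(left, right, prefix=path + "."))
        elif left != right:
            differences.append(path)
    return differences


# Hypothetical, shortened payloads; in practice paste the two captured payloads here.
chrome_payload = json.loads(
    '{"appName":"Netscape","screen":{"width":1600,"height":900,"colorDepth":24}}'
)
selenium_payload = json.loads(
    '{"appName":"Netscape","screen":{"width":1600,"height":900,"colorDepth":24}}'
)

print(diff_fingerprints(chrome_payload, selenium_payload))
```

An empty list means the two payloads carry the same values, which would suggest that the detection relies on something outside this payload.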



EDIT:



VPNs work for a single use but get detected after I load the first page. Clearly some javascript is being run to detect Selenium.






























  • 4




    @RyanWeinstein: It is not traffic. My guess is that Selenium needs to expose some JavaScript hooks which can be detected on the client-side JavaScript.
    – Mikko Ohtamaa
    Oct 21 '15 at 19:04






  • 3




    Or if it is traffic then it is a traffic pattern.... you are browsing pages too fast.
    – Mikko Ohtamaa
    Oct 21 '15 at 19:06






  • 3




    I'm not browsing too fast. I only load a single page and I navigate through it normally using my mouse and keyboard. Also, it doesn't make sense that Selenium needs to expose hooks, because it's literally running chrome.exe. It just runs normal Chrome and allows you to get data from it. Any other ideas? I was thinking maybe it has something to do with cookies. This is driving me crazy.
    – Ryan Weinstein
    Oct 21 '15 at 19:12






  • 4




    This site uses Distil bot detection technology and delivers content using the akamaitechnologies.com CDN from different IPs, e.g. 95.100.59.245, 104.70.243.66, 23.202.161.241
    – SIslam
    Oct 22 '15 at 10:12






  • 3




    I am experiencing the same issue with Selenium and the firefox driver. The interesting thing to note is I am running Selenium in a VMWare Workstation Virtual Machine that is accessing the internet through a NAT. The host machine is able to access stubhub, while the VM is unable to access when using Selenium, or even the browser instance Selenium launched. I had the VM Browser instance Blocked and stubhub still recognizes the machine and has it blocked. So it must be performing a fingerprint of the browser and machine in some manner.
    – Brian Cain
    Oct 23 '15 at 21:34















javascript python google-chrome selenium selenium-chromedriver














edited May 23 '17 at 10:31









Community











asked Oct 20 '15 at 0:08









Ryan Weinstein





















14 Answers




















accepted










For Mac Users



Replacing cdc_ variable using Vim or Perl



You can use vim or, as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl to replace the cdc_ variable in chromedriver (see the post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl saves you from having to recompile the source code or use a hex editor. Make sure to make a copy of the original chromedriver before attempting to edit it. The methods below were tested on chromedriver version 2.41.578706.





Using Vim



vim /path/to/chromedriver


After running the line above, you'll probably see a bunch of gibberish. Do the following:




  1. Search for cdc_ by typing /cdc_ and pressing return.

  2. Enable editing by pressing a.

  3. Delete any amount of $cdc_lasutopfhvcZLmcfl and replace what was deleted with an equal number of characters. If you don't, chromedriver will fail.

  4. After you're done editing, press esc.

  5. To save the changes and quit, type :wq! and press return.

  6. If you don't want to save the changes, but you want to quit, type :q! and press return.

  7. You're done.


Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





Using Perl



The line below replaces cdc_ with dog_:



perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver


Make sure that the replacement string has the same number of characters as the search string, otherwise the chromedriver will fail.
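The same equal-length replacement can also be sketched in Python, which makes the length constraint explicit and leaves the original binary untouched. This is only an illustration under stated assumptions: `patch_marker` is a hypothetical helper name, the paths you pass in are your own, and whether a patched binary still runs (or still avoids detection) depends on the chromedriver version.

```python
import shutil
from pathlib import Path


def patch_marker(src, dst, old=b"cdc_", new=b"dog_"):
    """Write a patched copy of src to dst and return how many occurrences were replaced."""
    if len(old) != len(new):
        # A different length would shift every byte after the match and break the binary.
        raise ValueError("replacement must be the same length as the search string")
    data = Path(src).read_bytes()
    count = data.count(old)
    Path(dst).write_bytes(data.replace(old, new))
    # Copy the permission bits so the patched copy stays executable.
    shutil.copymode(src, dst)
    return count
```

A call such as `patch_marker("/path/to/chromedriver", "/path/to/chromedriver_patched")` would write the patched copy next to the original; writing to a separate file rather than editing in place keeps the backup that the answer recommends.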



Perl Explanation



s///g denotes that you want to search for a string and replace it globally with another string (replaces all occurrences).




e.g., s/string/replacement/g




So,




s/// denotes searching for and replacing a string.



cdc_ is the search string.



dog_ is the replacement string.



g is the global key, which replaces every occurrence of the string.
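For readers more at home in Python, the effect of the g flag can be illustrated with `re.sub` on a byte string. This is a small sketch on a made-up sample, not on the real binary: `re.sub` replaces every match by default (like s///g), while `count=1` stops after the first match (roughly what s/// without g does on each line).

```python
import re

# Made-up sample bytes containing the marker twice.
sample = b"var key = '$cdc_abc'; // cdc_ again"

# Equivalent of s/cdc_/dog_/g: every occurrence is replaced.
replaced_all = re.sub(rb"cdc_", b"dog_", sample)

# Without the global behaviour, only the first occurrence is replaced.
replaced_first = re.sub(rb"cdc_", b"dog_", sample, count=1)

print(replaced_all)
print(replaced_first)
```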




How to check if the Perl replacement worked



The following line will print every occurrence of the search string cdc_:



perl -ne 'while(/cdc_/g){print "$&\n";}' /path/to/chromedriver



If this returns nothing, then cdc_ has been replaced.



Conversely, you can use this:



perl -ne 'while(/dog_/g){print "$&\n";}' /path/to/chromedriver



to see if your replacement string, dog_, is now in the chromedriver binary. If it is, the replacement string will be printed to the console.
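The same check can be sketched in Python by counting the marker bytes and listing where they occur. The helper names `marker_offsets` and `count_marker` are hypothetical, and the path is whatever binary you patched.

```python
from pathlib import Path


def marker_offsets(data, marker=b"cdc_"):
    """Return the byte offsets at which marker occurs in data."""
    offsets = []
    start = data.find(marker)
    while start != -1:
        offsets.append(start)
        start = data.find(marker, start + 1)
    return offsets


def count_marker(path, marker=b"cdc_"):
    """Count how many times marker occurs in the file at path."""
    return len(marker_offsets(Path(path).read_bytes(), marker))
```

After a successful replacement, `count_marker("/path/to/chromedriver")` should return 0 and `count_marker("/path/to/chromedriver", b"dog_")` should return the number of occurrences that were replaced.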



Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





Wrapping Up



After altering the chromedriver binary, make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.





My Experience With This Method



I was previously being detected on a website while trying to log in, but after replacing cdc_ with an equal sized string, I was able to log in. Like others have said though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, different network, or what have you.





























  • This looks like one of the easiest solutions if it works.
    – Ryan Weinstein
    Sep 14 at 0:07






  • 2




    this doesn't work with current chromedriver version
    – Leka Baper
    Nov 9 at 21:06










  • @LekaBaper Thanks for the heads up. The chromedriver version that I used was version 2.41.578706.
    – colossatr0n
    Nov 12 at 4:47










  • This doesn't work anymore. Check out my answer...
    – ShayanKM
    Dec 5 at 13:02































Basically, Selenium detection works by testing for predefined JavaScript variables that appear when running with Selenium. The bot detection scripts usually look for anything containing the word "selenium" / "webdriver" in any of the variables (on the window object), and also for document variables called $cdc_ and $wdc_. Of course, all of this depends on which browser you are on. Different browsers expose different things.



For me, I used Chrome, so all I had to do was ensure that $cdc_ no longer existed as a document variable, and voilà (download the chromedriver source code, rename $cdc_ to something else, and recompile chromedriver).



This is the function I modified in chromedriver:



call_function.js:



function getPageCache(opt_doc) {
  var doc = opt_doc || document;
  //var key = '$cdc_asdjflasutopfhvcZLmcfl_';
  var key = 'randomblabla_';
  if (!(key in doc))
    doc[key] = new Cache();
  return doc[key];
}


(Note the comment: all I did was turn $cdc_ into randomblabla_.)



Here is some pseudo-code that demonstrates some of the techniques that bot detection networks might use:



var runBotDetection = function () {
    // Property names checked on `document`.
    var documentDetectionKeys = [
        "__webdriver_evaluate",
        "__selenium_evaluate",
        "__webdriver_script_function",
        "__webdriver_script_func",
        "__webdriver_script_fn",
        "__fxdriver_evaluate",
        "__driver_unwrapped",
        "__webdriver_unwrapped",
        "__driver_evaluate",
        "__selenium_unwrapped",
        "__fxdriver_unwrapped",
    ];

    // Property names checked on `window`.
    var windowDetectionKeys = [
        "_phantom",
        "__nightmare",
        "_selenium",
        "callPhantom",
        "callSelenium",
        "_Selenium_IDE_Recorder",
    ];

    for (const windowDetectionKey of windowDetectionKeys) {
        if (window[windowDetectionKey]) {
            return true;
        }
    }
    for (const documentDetectionKey of documentDetectionKeys) {
        if (window['document'][documentDetectionKey]) {
            return true;
        }
    }

    // Look for a document key such as $cdc_..._ or $wdc_..._ that holds a cache object.
    for (const documentKey in window['document']) {
        if (documentKey.match(/\$[a-z]dc_/) && window['document'][documentKey]['cache_']) {
            return true;
        }
    }

    if (window['external'] && window['external'].toString() && (window['external'].toString()['indexOf']('Sequentum') != -1)) return true;

    if (window['document']['documentElement']['getAttribute']('selenium')) return true;
    if (window['document']['documentElement']['getAttribute']('webdriver')) return true;
    if (window['document']['documentElement']['getAttribute']('driver')) return true;

    return false;
};
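The name-matching part of the pseudo-code above can be ported to Python, which is handy for checking your own automated session from the Selenium side. This is a rough sketch under stated assumptions: `find_automation_markers` is a hypothetical helper, it only looks at property names (it does not inspect the `cache_` member or the `external` and attribute checks), and the key lists have to be collected separately.

```python
import re

# Property names taken from the pseudo-code above.
DOCUMENT_KEYS = {
    "__webdriver_evaluate", "__selenium_evaluate", "__webdriver_script_function",
    "__webdriver_script_func", "__webdriver_script_fn", "__fxdriver_evaluate",
    "__driver_unwrapped", "__webdriver_unwrapped", "__driver_evaluate",
    "__selenium_unwrapped", "__fxdriver_unwrapped",
}
WINDOW_KEYS = {
    "_phantom", "__nightmare", "_selenium", "callPhantom",
    "callSelenium", "_Selenium_IDE_Recorder",
}
# Matches keys such as $cdc_..._ or $wdc_..._ anywhere in the name.
CACHE_KEY_PATTERN = re.compile(r"\$[a-z]dc_")


def find_automation_markers(window_keys, document_keys):
    """Return the property names that the checks above would flag."""
    hits = [key for key in window_keys if key in WINDOW_KEYS]
    hits += [
        key for key in document_keys
        if key in DOCUMENT_KEYS or CACHE_KEY_PATTERN.search(key)
    ]
    return hits


print(find_automation_markers(
    ["innerWidth", "_phantom"],
    ["title", "$cdc_asdjflasutopfhvcZLmcfl_"],
))
```

In a live session the two lists could be gathered with something like `driver.execute_script("return Object.keys(window)")` and the same call for `document`; that collection step is an assumption about your setup and is not shown in the answer.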


According to user @szx, it is also possible to simply open chromedriver.exe in a hex editor and do the replacement manually, without compiling anything.

























  • 17




    yes it worked without probs, note one problem is if you fell into the "blacklist" BEFORE this change, it's quite hard to get out. if you want to get out of the existing black list, you need to implement fake canvas fingerprinting, disable flash, change IP, and change request header order (swap language and Accept headers). Once you fell into the blacklist, they have very good measures to track you, even if you change IP, even if you open chrome in incognito, etc
    – Erti-Chris Eelmaa
    Dec 19 '16 at 19:23








  • 4




    This is very interesting, thanks for going to the trouble. Its been more than a year since I asked this question and its nice to finally have an answer. I'll see if I can pass you the check-mark once I've tested the solution myself.
    – Ryan Weinstein
    Dec 19 '16 at 22:41






  • 3




    What is the most straightforward way to compile chromedriver on Windows?
    – Arya
    Jun 23 '17 at 19:34






  • 2




    I found the file "/Users/your_username/chromium/src/chrome/test/chromedriver/js"
    – JonghoKim
    Jul 1 '17 at 9:08






  • 5




    I simply replaced $cdc with xxxx in chromedriver.exe in a hex editor and it worked! I also noticed that if you maximize the browser window (rather than use a predefined size) it's detected less often.
    – szx
    Feb 25 at 17:59




















up vote
68
down vote



+100










As we've already figured out in the question and the posted answers, there is an anti Web-scraping and a Bot detection service called "Distil Networks" in play here. And, according to the company CEO's interview:




Even though they can create new bots, we figured out a way to identify
Selenium the a tool they’re using, so we’re blocking Selenium no
matter how many times they iterate on that bot. We’re doing that now
with Python and a lot of different technologies. Once we see a pattern
emerge from one type of bot, then we work to reverse engineer the
technology they use and identify it as malicious.




It'll take time and additional challenges to understand how exactly they are detecting Selenium, but here is what we can say for sure at the moment:




  • It's not related to the actions you take with selenium: once you navigate to the site, you get detected and banned immediately. I've tried adding artificial random delays between actions and pausing after the page loads; nothing helped.

  • It's not about the browser fingerprint either: I tried multiple browsers, with clean and non-clean profiles and in incognito mode; nothing helped.

  • Since, according to the hint in the interview, this was "reverse engineering", I suspect it is done by some JS code executed in the browser that reveals the browser is automated via selenium webdriver.


Decided to post it as an answer, since clearly:




Can a website detect when you are using selenium with chromedriver?




Yes.





Also, what I haven't experimented with is older selenium and older browser versions. In theory, there could be something implemented or added to selenium at a certain point that the Distil Networks bot detector currently relies on. If that is the case, we might detect (yeah, let's detect the detector) at what point/version the relevant change was made and look into the changelog and changesets; maybe this could give us more information on where to look and what they use to detect a webdriver-powered browser. It's just a theory that needs to be tested.





























  • This is crazy. So they really have a way of detecting it that no one else has. I really want to figure out how they're doing it. Can you provide any other information at all as to how they could possibly be doing it?
    – Ryan Weinstein
    Oct 29 '15 at 20:50










  • @RyanWeinstein well, we have no actual proof and we can only speculate and test. For now, I would say they have a way to detect us using selenium. Try experimenting with selenium versions - this may give you some clues.
    – alecxe
    Oct 29 '15 at 22:19






  • 1




    Could it have to do with how ephemeral ports are determined? The method stays away from well-known ranges. github.com/SeleniumHQ/selenium/blob/…
    – Elliott
    Jan 12 '16 at 22:12








  • 6




    Easyjet uses the Distil Networks service. It can block dumb bots but not the sophisticated ones: we tested it with more than 2000 requests a day from different IPs (which we reuse), so basically each IP makes 5-10 requests a day. From this I can tell that all these bot-detection services exist just to develop and sell algorithms that work about 45% of the time. The scraper we used was easy to detect; I could block it myself, while Distil Networks, squareshield and others couldn't, which pushed me to never use any of them.
    – Jeffery ThaGintoki
    Feb 20 '17 at 18:16






  • 2




    @alecxe Any updates on this? Still can't use Selenium with Distil?
    – Utku
    May 21 at 15:17


















up vote
18
down vote













Example of how it's implemented on wellsfargo.com:



try {
if (window.document.documentElement.getAttribute("webdriver")) return !+""
} catch (IDLMrxxel) {}
try {
if ("_Selenium_IDE_Recorder" in window) return !+""
} catch (KknKsUayS) {}
try {
if ("__webdriver_script_fn" in document) return !+""
























  • 2




    why is the last try not closed ? besides can u explain your answer a little.
    – ishandutta2007
    Aug 22 at 16:42


















up vote
8
down vote













Try using selenium with a specific Chrome user profile. That way you can run it as a specific user and define anything you want. When you do so, it runs as a 'real' user; look at the chrome process with a process explorer and you'll see the difference in its command-line tags (switches).



For example:



import os
from selenium import webdriver

username = os.getenv("USERNAME")
# Raw strings keep the Windows backslashes literal.
userProfile = os.path.join(r"C:\Users", username, r"AppData\Local\Google\Chrome\User Data\Default")
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir={}".format(userProfile))
# Add any other tag (switch) you want here.
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
chromedriver = r"C:\Python27\chromedriver\chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=options)


chrome tag list here


































    up vote
    7
    down vote














    partial interface Navigator {
    readonly attribute boolean webdriver;
    };



    The webdriver IDL attribute of the Navigator interface must return the value of the webdriver-active flag, which is initially false.



    This property allows websites to determine that the user agent is under control by WebDriver, and can be used to help mitigate denial-of-service attacks.




    Taken directly from the 2017 W3C Editor's Draft of WebDriver. This heavily implies that, at the very least, future iterations of selenium's drivers will be identifiable to prevent misuse. Ultimately, it's hard to tell without the source code what exactly causes chromedriver specifically to be detectable.























    • 4




      "it's hard to tell without the source code" .. well the source code is freely available
      – Corey Goldberg
      Nov 27 '17 at 16:08






    • 2




      I meant without the website in question's source code. It's hard to tell what they are checking against.
      – bryce
      Mar 19 at 21:12


















    up vote
    5
    down vote













    Even if you are sending all the right data (e.g. Selenium doesn't show up as an extension, you have a reasonable resolution/bit-depth, &c), there are a number of services and tools which profile visitor behaviour to determine whether the actor is a user or an automated system.



    For example, visiting a site then immediately going to perform some action by moving the mouse directly to the relevant button, in less than a second, is something no user would actually do.
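    As a rough illustration of the kind of timing rule described above, here is a minimal Python sketch. It is not taken from any real detection service: the `FirstAction` type, the `looks_automated` helper and the one-second cut-off are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the text above only says "less than a second".
MIN_HUMAN_DELAY_SECONDS = 1.0


@dataclass
class FirstAction:
    page_loaded_at: float  # timestamp in seconds when the page finished loading
    clicked_at: float      # timestamp in seconds of the first click


def looks_automated(action: FirstAction) -> bool:
    """Flag a first click that arrives faster than a person plausibly could."""
    return (action.clicked_at - action.page_loaded_at) < MIN_HUMAN_DELAY_SECONDS
```

    A real profiler would presumably combine many such signals (pointer path, scroll events, typing cadence) rather than rely on a single threshold.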



    It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it'll also help you verify whether there are any specific parameters that indicate you're running in Selenium.























    • 1




      I've already used that website and the fingerprint is identical to my normal browser. Also I'm not automating anything. I'm just browsing as normal.
      – Ryan Weinstein
      Oct 26 '15 at 4:46


















    up vote
    4
    down vote













    It sounds like they are behind a web application firewall. Take a look at modsecurity and owasp to see how those work. In reality, what you are asking is how to do bot detection evasion. That is not what selenium web driver is for. It is for testing your web application not hitting other web applications. It is possible, but basically, you'd have to look at what a WAF looks for in their rule set and specifically avoid it with selenium if you can. Even then, it might still not work because you don't know what WAF they are using. You did the right first step, that is faking the user agent. If that didn't work though, then a WAF is in place and you probably need to get more tricky.



    Edit:
    Point taken from other answer. Make sure your user agent is actually being set correctly first. Maybe have it hit a local web server or sniff the traffic going out.
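    One way to do the local-server check is a tiny server that echoes back whatever request headers it receives. The sketch below uses only the Python standard library; the names `HeaderEchoHandler`, `start_echo_server` and `fetch_echoed_headers` are made up for this example, and the `urllib` client at the bottom only stands in for the Selenium-driven browser you would actually point at the printed URL.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Reply to any GET with the request headers as JSON."""

    def do_GET(self):
        body = json.dumps(dict(self.headers.items()), indent=2).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_echo_server(port: int = 0) -> HTTPServer:
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HeaderEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_echoed_headers(url: str, user_agent: str) -> dict:
    """Stand-in client: request the URL and return the headers the server saw."""
    # An empty ProxyHandler makes sure the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with opener.open(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    server = start_echo_server()
    url = "http://127.0.0.1:{}/".format(server.server_address[1])
    print("Point the browser under test at", url)
    print(fetch_echoed_headers(url, "example-agent/1.0"))
    server.shutdown()
    server.server_close()
```

    Loading the same URL in the normal browser and in the Selenium-driven one lets you compare the two header sets side by side.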





























    • I think you are on the correct path. I tested with my setup and replaced the User Agent with a valid user agent string that successfully went through and received the same result, stubhub blocked the request.
      – Brian Cain
      Oct 23 '15 at 23:36










    • Okay, if the user agent is fine, then they have attack detection in place for sure. WAF is a good place to start. Not that I'm condoning hitting other websites. I'm just answering in the name of science and advancement of human knowledge.
      – Bassel Samman
      Oct 23 '15 at 23:49






    • 1




      If it was an HTTP header issue then wouldn't the normal browser get blocked? The HTTP headers are exactly the same. Also what exactly am I looking at with that github link? Have you tried using selenium to go on stubhub? Something is very very off.
      – Ryan Weinstein
      Oct 26 '15 at 21:15






    • 1




      I'm sorry for the confusion. I'll look into that and you don't have to help me anymore if you don't want to. Most of my experience is in programming systems applications, so I was not familiar with these modsecurity rules that you're talking about. I'll take a look and try to educate myself. I'm not trying to bypass anything, I was just interested in knowing how these websites detect a user using selenium.
      – Ryan Weinstein
      Oct 27 '15 at 18:49






    • 1




      I'm a developer too :). Learning is a cause I can get behind. I don't mind helping, I just wanted to make clear that I didn't know your intentions and could not exactly help you bypass their website security. To answer your question though, it is not selenium that they are detecting. The rules detected suspicious behavior and decided to take the appropriate measures against the offending client. They catch you by what you are not doing more than by what you are doing. In the repo link, you can checkout this file to get an idea base_rules/modsecurity_crs_20_protocol_violations.conf
      – Bassel Samman
      Oct 28 '15 at 1:29


















    up vote
    4
    down vote













    Firefox is said to set window.navigator.webdriver === true if working with a webdriver. That was according to one of the older specs (e.g.: archive.org) but I couldn't find it in the new one except for some very vague wording in the appendices.



    A test for it is in the selenium code in the file fingerprint_test.js, where the comment at the end says "Currently only implemented in firefox", but I wasn't able to identify any code in that direction with some simple grepping, neither in the current (41.0.2) Firefox release-tree nor in the Chromium-tree.



    I also found a comment for an older commit regarding fingerprinting in the firefox driver b82512999938 from January 2015. That code is still in the Selenium GIT-master downloaded yesterday at javascript/firefox-driver/extension/content/server.js with a comment linking to the slightly differently worded appendix in the current w3c webdriver spec.























    • 1




      I just tested webdriver with Firefox 55 and I can confirm this is not true. The variable window.navigator.webdriver is not defined.
      – speedplane
      Oct 2 '17 at 17:56


















    up vote
    3
    down vote













    The bot detection I've seen seems more sophisticated or at least different than what I've read through in the answers below.



    EXPERIMENT 1:




    1. I open a browser and web page with Selenium from a Python console.

    2. The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.

    3. I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).

    4. I press the left mouse button again (remember, cursor is above a given link).

    5. The link opens normally, as it should.


    EXPERIMENT 2:




    1. As before, I open a browser and the web page with Selenium from a Python console.


    2. This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.


    3. The link doesn't open, but I am taken to a sign up page.



    IMPLICATIONS:




    • opening a web browser via Selenium doesn't preclude me from appearing human

    • moving the mouse like a human is not necessary to be classified as human

    • clicking something via Selenium with an offset still raises the alarm


    Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.

























    • 1




      My belief is that Selenium injects something into the page via javascript to find and access elements. This injection is what I believe they are detecting.
      – zeusalmighty
      Oct 25 at 13:31


















    up vote
    2
    down vote













    Obfuscating JavaScripts result



    I have checked the chromedriver source code. It injects some JavaScript files into the browser.
    Every JavaScript file at this link is injected into the web pages:
    https://chromium.googlesource.com/chromium/src/+/master/chrome/test/chromedriver/js/



    So I used reverse engineering and obfuscated the JS files by hex editing. After that I was sure that no JavaScript variable names, function names or fixed strings remained that could be used to uncover Selenium activity. But some sites and reCaptcha still detect Selenium!

    Maybe they check for the modifications caused by the chromedriver JS execution :)




    Edit 1:



    Chrome 'navigator' parameters modification



    I discovered that some parameters in 'navigator' reveal the use of chromedriver.
    These are the parameters:





    • "navigator.webdriver": in non-automated mode it is 'undefined'; in automated mode it is 'true'.


    • "navigator.plugins": in headless Chrome it has length 0, so I added some fake elements to fool the plugin-length check.

    • "navigator.languages": was set to the default Chrome value '["en-US", "en", "es"]'.
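    To inspect these three properties from the Python side, something like the sketch below could be used. It assumes only that `driver` has Selenium's `execute_script(script)` method; the helper name `navigator_red_flags` and the rules for what counts as suspicious (no plugins, no languages) are my own assumptions, not what any detector is known to check.

```python
# JavaScript evaluated in the page; Selenium converts the returned object
# into a Python dict.
NAVIGATOR_PROBE = """
return {
    webdriver: navigator.webdriver === true,
    pluginCount: navigator.plugins.length,
    languages: Array.from(navigator.languages || [])
};
"""


def navigator_red_flags(driver) -> list:
    """Return human-readable warnings for the page currently loaded in driver.

    driver is any object with Selenium's execute_script(script) method.
    """
    info = driver.execute_script(NAVIGATOR_PROBE)
    flags = []
    if info.get("webdriver"):
        flags.append("navigator.webdriver is true")
    # Assumed rule: an empty plugin list looks like headless Chrome.
    if info.get("pluginCount", 0) == 0:
        flags.append("navigator.plugins is empty")
    # Assumed rule: a regular browser reports at least one language.
    if not info.get("languages"):
        flags.append("navigator.languages is empty")
    return flags
```

    An empty list only means these three values look ordinary; as noted in this answer, sites may still detect the driver some other way.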


    So what I needed was a Chrome extension to run JavaScript on the web pages. I made an extension with the JS code provided in the article and used another article to add the zipped extension to my project. I successfully changed the values, but still nothing changed!



    I didn't find other variables like these, but that doesn't mean they don't exist. reCaptcha still detects chromedriver, so there must be more variables to change. The next step would be reverse engineering the detector services, which I don't want to do.



    Now I'm not sure whether it is worth spending more time on this automation process or whether I should look for alternative methods!




































      up vote
      1
      down vote













      Write an html page with the following code. You will see that in the DOM selenium applies a webdriver attribute in the outerHTML






      <html>
      <head>
      <script type="text/javascript">
      <!--
      function showWindow(){
      javascript:(alert(document.documentElement.outerHTML));
      }
      //-->
      </script>
      </head>
      <body>
      <form>
      <input type="button" value="Show outerHTML" onclick="showWindow()">
      </form>
      </body>
      </html>

























      • 4




        The attribute is added only in Firefox.
        – Louis
        Oct 28 '15 at 9:22






      • 1




        And it is possible to remove it from the selenium extension that controlls browser. It will work anyway.
        – erm3nda
        Jun 12 '17 at 23:53


















      up vote
      1
      down vote













      Some sites are detecting this:



      function d() {
      try {
      if (window.document.$cdc_asdjflasutopfhvcZLmcfl_.cache_)
      return !0
      } catch (e) {}

      try {
      //if (window.document.documentElement.getAttribute(decodeURIComponent("%77%65%62%64%72%69%76%65%72")))
      if (window.document.documentElement.getAttribute("webdriver"))
      return !0
      } catch (e) {}

      try {
      //if (decodeURIComponent("%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72") in window)
      if ("_Selenium_IDE_Recorder" in window)
      return !0
      } catch (e) {}

      try {
      //if (decodeURIComponent("%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E") in document)
      if ("__webdriver_script_fn" in document)
      return !0
      } catch (e) {}


























      • This doesn't work for Chrome and Firefox, selenium 3.5.0, ChromeDriver 2.31.488774, geckodriver 0.18.0
        – jerrypy
        Aug 29 '17 at 6:35


















      up vote
      0
      down vote













      It seems to me the simplest way to do it with Selenium is to intercept the XHR that sends back the browser fingerprint.



      But since this is a Selenium-only problem, it's better just to use something else. Selenium is supposed to make things like this easier, not way harder.



























      • What are other options to selenium?
        – Tai
        Dec 3 at 17:38










      • And can they be detected as well?
        – Tai
        Dec 3 at 17:45










      • I guess Requests would be the main python option. If you send the same exact requests that your browser sends, you will appear as a normal browser.
        – pguardiario
        Dec 3 at 23:14










      protected by Mark Rotteveel Nov 14 at 16:33




      up vote
      7
      down vote



      accepted










      For Mac Users



      Replacing cdc_ variable using Vim or Perl



      You can use vim, or as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl, to replace the cdc_ variable in chromedriver (see the post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl saves you from having to recompile the source code or use a hex editor. Make sure to make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.





      Using Vim



      vim /path/to/chromedriver


      After running the line above, you'll probably see a bunch of gibberish. Do the following:




      1. Search for cdc_ by typing /cdc_ and pressing return.

      2. Enable editing by pressing a.

      3. Delete any amount of $cdc_lasutopfhvcZLmcfl and replace what was deleted with an equal number of characters. If you don't, chromedriver will fail.

      4. After you're done editing, press esc.

      5. To save the changes and quit, type :wq! and press return.

      6. If you don't want to save the changes, but you want to quit, type :q! and press return.

      7. You're done.


      Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





      Using Perl



      The line below replaces cdc_ with dog_:



      perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver


      Make sure that the replacement string has the same number of characters as the search string, otherwise the chromedriver will fail.
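      If you'd rather not use Perl, the same same-length replacement can be sketched in Python. This is only a sketch under the assumptions stated in this answer (the `cdc_` and `dog_` strings, a binary that tolerates the in-place edit); the function names `replace_same_length` and `patch_binary` and the `.bak` backup suffix are my own choices.

```python
import shutil
from pathlib import Path


def replace_same_length(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace every occurrence of old with new, refusing to change the size."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the search string")
    return data.replace(old, new)


def patch_binary(path: str, old: bytes = b"cdc_", new: bytes = b"dog_") -> int:
    """Patch the file in place after saving a .bak copy.

    Returns the number of occurrences that were replaced.
    """
    target = Path(path)
    data = target.read_bytes()
    count = data.count(old)
    patched = replace_same_length(data, old, new)  # raises before anything is written
    shutil.copy2(target, target.with_name(target.name + ".bak"))
    target.write_bytes(patched)
    return count
```

      Calling `patch_binary("/path/to/chromedriver")` would then play the role of the one-liner above, with the length check enforced in code rather than left to the reader.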



      Perl Explanation



      s///g denotes that you want to search for a string and replace it globally with another string (replaces all occurrences).




      e.g., s/string/replacement/g




      So,




      s/// denotes searching for and replacing a string.



      cdc_ is the search string.



      dog_ is the replacement string.



      g is the global flag, which replaces every occurrence of the string.




      How to check if the Perl replacement worked



      The following line will print every occurrence of the search string cdc_:



      perl -ne 'while(/cdc_/g){print "$&\n";}' /path/to/chromedriver



      If this returns nothing, then cdc_ has been replaced.



      Conversely, you can use this:



      perl -ne 'while(/dog_/g){print "$&\n";}' /path/to/chromedriver



      to see if your replacement string, dog_, is now in the chromedriver binary. If it is, the replacement string will be printed to the console.
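      The same check can be sketched in Python by counting how often each string occurs in the binary. The helper names `find_offsets` and `report` are hypothetical, and `cdc_`/`dog_` are just the strings used in this answer.

```python
from pathlib import Path


def find_offsets(data: bytes, needle: bytes) -> list:
    """Return the byte offset of every occurrence of needle in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def report(path: str, needles=(b"cdc_", b"dog_")) -> dict:
    """Map each search string to the number of times it occurs in the file."""
    data = Path(path).read_bytes()
    return {needle.decode("ascii"): len(find_offsets(data, needle)) for needle in needles}
```

      After a successful replacement, `report("/path/to/chromedriver")` should show a count of 0 for `cdc_` and a non-zero count for `dog_`.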



      Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





      Wrapping Up



      After altering the chromedriver binary, make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.





      My Experience With This Method



      I was previously being detected on a website while trying to log in, but after replacing cdc_ with a string of equal length, I was able to log in. Like others have said though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, a different network, or what have you.





























      • This looks like one of the easiest solutions if it works.
        – Ryan Weinstein
        Sep 14 at 0:07






      • 2




        this doesn't work with current chromedriver version
        – Leka Baper
        Nov 9 at 21:06










      • @LekaBaper Thanks for the heads up. The chromedriver version that I used was version 2.41.578706.
        – colossatr0n
        Nov 12 at 4:47










      • This doesn't work anymore. Checkout my answer...
        – ShayanKM
        Dec 5 at 13:02















      up vote
      7
      down vote



      accepted










      For Mac Users



      Replacing cdc_ variable using Vim or Perl



      You can use vim, or as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl, to replace the cdc_ variable in chromedriver(See post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl prevents you from having to recompile source code or use a hex-editor. Make sure to make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.





      Using Vim



      vim /path/to/chromedriver


      After running the line above, you'll probably see a bunch of gibberish. Do the following:




      1. Search for cdc_ by typing /cdc_ and pressing return.

      2. Enable editing by pressing a.

      3. Delete any amount of $cdc_lasutopfhvcZLmcfl and replace what was deleted with an equal amount characters. If you don't, chromedriver will fail.

      4. After you're done editing, press esc.

      5. To save the changes and quit, type :wq! and press return.

      6. If you don't want to save the changes, but you want to quit, type :q! and press return.

      7. You're done.


      Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





      Using Perl



      The line below replaces cdc_ with dog_:



      perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver


      Make sure that the replacement string has the same number of characters as the search string, otherwise the chromedriver will fail.



      Perl Explanation



      s///g denotes that you want to search for a string and replace it globally with another string (replaces all occurrences).




      e.g., s/string/replacment/g




      So,




      s/// denotes searching for and replacing a string.



      cdc_ is the search string.



      dog_ is the replacement string.



      g is the global key, which replaces every occurrence of the string.




      How to check if the Perl replacement worked



      The following line will print every occurrence of the search string cdc_:



      perl -ne 'while(/cdc_/g){print "$&n";}' /path/to/chromedriver



      If this returns nothing, then cdc_ has been replaced.



      Conversely, you can use the this:



      perl -ne 'while(/dog_/g){print "$&n";}' /path/to/chromedriver



      to see if your replacement string, dog_, is now in the chromedriver binary. If it is, the replacement string will be printed to the console.



      Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





      Wrapping Up



      After altering the chromedriver binary, make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.





      My Experience With This Method



      I was previously being detected on a website while trying to log in, but after replacing cdc_ with an equal sized string, I was able to log in. Like others have said though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, different network, or what have you.






      share|improve this answer























      • This looks like one of the easiest solutions if it works.
        – Ryan Weinstein
        Sep 14 at 0:07






      • 2




        this doesn't work with current chromedriver version
        – Leka Baper
        Nov 9 at 21:06










      • @LekaBaper Thanks for the heads up. The chromedriver version that I used was version 2.41.578706.
        – colossatr0n
        Nov 12 at 4:47










      • This doesn't work anymore. Checkout my answer...
        – ShayanKM
        Dec 5 at 13:02













      up vote
      7
      down vote



      accepted







      up vote
      7
      down vote



      accepted






      For Mac Users



      Replacing cdc_ variable using Vim or Perl



      You can use vim, or as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl, to replace the cdc_ variable in chromedriver(See post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl prevents you from having to recompile source code or use a hex-editor. Make sure to make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.





      Using Vim



      vim /path/to/chromedriver


      After running the line above, you'll probably see a bunch of gibberish. Do the following:




      1. Search for cdc_ by typing /cdc_ and pressing return.

      2. Enable editing by pressing a.

      3. Delete any amount of $cdc_lasutopfhvcZLmcfl and replace what was deleted with an equal amount characters. If you don't, chromedriver will fail.

      4. After you're done editing, press esc.

      5. To save the changes and quit, type :wq! and press return.

      6. If you don't want to save the changes, but you want to quit, type :q! and press return.

      7. You're done.


      Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





      Using Perl



      The line below replaces cdc_ with dog_:



      perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver


      Make sure that the replacement string has the same number of characters as the search string, otherwise the chromedriver will fail.



      Perl Explanation



      s///g denotes that you want to search for a string and replace it globally with another string (replaces all occurrences).




      e.g., s/string/replacment/g




      So,




      s/// denotes searching for and replacing a string.



      cdc_ is the search string.



      dog_ is the replacement string.



      g is the global key, which replaces every occurrence of the string.




      How to check if the Perl replacement worked



      The following line will print every occurrence of the search string cdc_:



      perl -ne 'while(/cdc_/g){print "$&n";}' /path/to/chromedriver



      If this returns nothing, then cdc_ has been replaced.



      Conversely, you can use the this:



      perl -ne 'while(/dog_/g){print "$&n";}' /path/to/chromedriver



      to see if your replacement string, dog_, is now in the chromedriver binary. If it is, the replacement string will be printed to the console.



      Go to the altered chromedriver and double click on it. A terminal window should open up. If you don't see killed in the output, you successfully altered the driver.





      Wrapping Up



      After altering the chromedriver binary, make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.





      My Experience With This Method



      I was previously being detected on a website while trying to log in, but after replacing cdc_ with an equal sized string, I was able to log in. Like others have said though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, different network, or what have you.














      edited Nov 12 at 4:52

























      answered Aug 31 at 3:49









      colossatr0n













      • This looks like one of the easiest solutions if it works.
        – Ryan Weinstein
        Sep 14 at 0:07






      • 2




        this doesn't work with current chromedriver version
        – Leka Baper
        Nov 9 at 21:06










      • @LekaBaper Thanks for the heads up. The chromedriver version that I used was version 2.41.578706.
        – colossatr0n
        Nov 12 at 4:47










      • This doesn't work anymore. Check out my answer...
        – ShayanKM
        Dec 5 at 13:02






























      up vote
      100
      down vote













      Basically, Selenium detection works by testing for predefined JavaScript variables that appear when running with Selenium. Bot detection scripts usually look for anything containing the word "selenium" or "webdriver" in any of the variables on the window object, and also for document variables called $cdc_ and $wdc_. Of course, all of this depends on which browser you are using; different browsers expose different things.



      I used Chrome, so all I had to do was ensure that $cdc_ no longer existed as a document variable, and voilà: download the chromedriver source code, rename $cdc_ in it, and recompile.



      This is the function I modified in chromedriver:



      call_function.js:



      function getPageCache(opt_doc) {
        var doc = opt_doc || document;
        //var key = '$cdc_asdjflasutopfhvcZLmcfl_';
        var key = 'randomblabla_';
        if (!(key in doc))
          doc[key] = new Cache();
        return doc[key];
      }


      (Note the commented-out line: all I did was change $cdc_ to randomblabla_.)



      Here is some pseudo-code that demonstrates a few of the techniques that bot detection scripts might use:



      // Returns true if any known automation marker is present.
      var runBotDetection = function () {
        var documentDetectionKeys = [
          "__webdriver_evaluate",
          "__selenium_evaluate",
          "__webdriver_script_function",
          "__webdriver_script_func",
          "__webdriver_script_fn",
          "__fxdriver_evaluate",
          "__driver_unwrapped",
          "__webdriver_unwrapped",
          "__driver_evaluate",
          "__selenium_unwrapped",
          "__fxdriver_unwrapped",
        ];

        var windowDetectionKeys = [
          "_phantom",
          "__nightmare",
          "_selenium",
          "callPhantom",
          "callSelenium",
          "_Selenium_IDE_Recorder",
        ];

        for (const windowDetectionKey of windowDetectionKeys) {
          if (window[windowDetectionKey]) {
            return true;
          }
        }
        for (const documentDetectionKey of documentDetectionKeys) {
          if (window['document'][documentDetectionKey]) {
            return true;
          }
        }

        // Look for document keys such as $cdc_... that hold a page cache.
        for (const documentKey in window['document']) {
          if (documentKey.match(/\$[a-z]dc_/) && window['document'][documentKey]['cache_']) {
            return true;
          }
        }

        if (window['external'] && window['external'].toString() && (window['external'].toString()['indexOf']('Sequentum') != -1)) return true;

        if (window['document']['documentElement']['getAttribute']('selenium')) return true;
        if (window['document']['documentElement']['getAttribute']('webdriver')) return true;
        if (window['document']['documentElement']['getAttribute']('driver')) return true;

        return false;
      };


      According to user @szx, it is also possible to open chromedriver.exe in a hex editor and do the replacement manually, without compiling anything.
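For anyone who wants to script the hex-editor approach, here is a rough Python sketch of a same-length byte replacement. It is an illustration under assumptions, not a tested recipe for any particular chromedriver version: the function name `patch_binary`, the `.orig` backup suffix and the `dog_` replacement string are all made up for the example.

```python
from pathlib import Path
import shutil


def patch_binary(path, old=b"cdc_", new=b"dog_", backup_suffix=".orig"):
    """Replace every occurrence of `old` with `new` in the file at `path`.

    Refuses to run unless both byte strings have the same length, so the
    size of the executable never changes. A backup copy is written next to
    the file before it is modified. Returns the number of replacements made.
    """
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    target = Path(path)
    data = target.read_bytes()
    hits = data.count(old)
    if hits:
        # Keep the untouched original so the change can be undone.
        shutil.copy2(target, target.with_name(target.name + backup_suffix))
        target.write_bytes(data.replace(old, new))
    return hits
```

The length check is the important part: keeping the file size unchanged is the only safeguard this sketch provides, and it says nothing about whether a given site will still detect the driver afterwards.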

























      • 17




        Yes, it worked without problems. Note one problem: if you fell into the "blacklist" BEFORE this change, it's quite hard to get out. To get off the existing blacklist, you need to implement fake canvas fingerprinting, disable Flash, change your IP, and change the request header order (swap the language and Accept headers). Once you are on the blacklist, they have very good measures to track you, even if you change IP, even if you open Chrome in incognito, etc.
        – Erti-Chris Eelmaa
        Dec 19 '16 at 19:23








      • 4




        This is very interesting, thanks for going to the trouble. It's been more than a year since I asked this question and it's nice to finally have an answer. I'll see if I can pass you the check-mark once I've tested the solution myself.
        – Ryan Weinstein
        Dec 19 '16 at 22:41






      • 3




        What is the most straightforward way to compile chromedriver on Windows?
        – Arya
        Jun 23 '17 at 19:34






      • 2




        I found the file "/Users/your_username/chromium/src/chrome/test/chromedriver/js"
        – JonghoKim
        Jul 1 '17 at 9:08






      • 5




        I simply replaced $cdc with xxxx in chromedriver.exe in a hex editor and it worked! I also noticed that if you maximize the browser window (rather than use a predefined size) it's detected less often.
        – szx
        Feb 25 at 17:59

























      edited May 11 at 17:22









      Grokify











      answered Dec 19 '16 at 10:14









      Erti-Chris Eelmaa





















      up vote
      68
      down vote



      +100










      As we've already figured out in the question and the posted answers, an anti-web-scraping and bot detection service called "Distil Networks" is in play here. According to an interview with the company's CEO:




      Even though they can create new bots, we figured out a way to identify
      Selenium the a tool they’re using, so we’re blocking Selenium no
      matter how many times they iterate on that bot. We’re doing that now
      with Python and a lot of different technologies. Once we see a pattern
      emerge from one type of bot, then we work to reverse engineer the
      technology they use and identify it as malicious.




      It will take time and further experiments to understand exactly how they detect Selenium, but here is what we can say for sure at the moment:




      • It's not related to the actions you take with Selenium: once you navigate to the site, you are immediately detected and banned. I tried adding artificial random delays between actions and pausing after the page loaded; nothing helped.

      • It's not about the browser fingerprint either: I tried multiple browsers, with clean and non-clean profiles and in incognito mode; nothing helped.

      • Since, according to the hint in the interview, this involved "reverse engineering", I suspect it is done with some JS code executed in the browser that reveals the browser is being automated via Selenium WebDriver.


      I decided to post this as an answer, since clearly:




      Can a website detect when you are using selenium with chromedriver?




      Yes.





      Also, I haven't yet experimented with older Selenium and older browser versions. In theory, something may have been implemented or added to Selenium at some point that the Distil Networks bot detector now relies on. If so, we could detect (yeah, let's detect the detector) at which version the relevant change was made, look into the changelog and changesets, and perhaps learn where to look and what they use to detect a WebDriver-powered browser. It's just a theory that needs to be tested.
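If someone wants to try the version hunt, it boils down to a binary search over an ordered list of releases. The sketch below is purely illustrative: `first_detected_version` is a made-up name, `is_detected` stands for whatever manual or scripted check you use to decide that a given version gets blocked, and the whole idea assumes that once a version is detected, every newer one is detected too.

```python
def first_detected_version(versions, is_detected):
    """Binary-search `versions` (ordered oldest first) for the first detected one.

    `is_detected` is a caller-supplied (hypothetical) check that returns True
    when a browser driven by that version gets blocked. Assumes detection is
    monotonic across versions. Returns None when no version is detected.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_detected(versions[mid]):
            hi = mid          # the first detected version is at mid or earlier
        else:
            lo = mid + 1      # the first detected version is after mid
    return versions[lo] if lo < len(versions) else None
```

With N releases this needs roughly log2(N) checks instead of N, which matters when every check means setting up a browser and driver and getting a fresh IP.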





























      • This is crazy. So they really have a way of detecting it that no one else has. I really want to figure out how they're doing it. Can you provide any other information at all as to how they could possibly be doing it?
        – Ryan Weinstein
        Oct 29 '15 at 20:50










      • @RyanWeinstein well, we have no actual proof and we can only speculate and test. For now, I would say they have a way to detect us using selenium. Try experimenting with selenium versions - this may give you some clues.
        – alecxe
        Oct 29 '15 at 22:19






      • 1




        Could it have to do with how ephemeral ports are determined? The method stays away from well-known ranges. github.com/SeleniumHQ/selenium/blob/…
        – Elliott
        Jan 12 '16 at 22:12








      • 6




        EasyJet uses the Distil Networks service. It can block simple bots but not the sophisticated ones: we tested it with more than 2000 requests a day from different IPs (reusing the same addresses), so each IP made about 5-10 requests a day. From this I can tell that all these bot detection services just develop and sell algorithms that work about 45% of the time. The scraper we used was easy to detect; I could block it myself, while Distil Networks, squareshield and others couldn't, which pushed me to never use any of them.
        – Jeffery ThaGintoki
        Feb 20 '17 at 18:16






      • 2




        @alecxe Any updates on this? Still can't use Selenium with Distil?
        – Utku
        May 21 at 15:17















      up vote
      68
      down vote



      +100










      As we've already figured out in the question and the posted answers, there is an anti Web-scraping and a Bot detection service called "Distil Networks" in play here. And, according to the company CEO's interview:




      Even though they can create new bots, we figured out a way to identify
      Selenium the a tool they’re using, so we’re blocking Selenium no
      matter how many times they iterate on that bot
      . We’re doing that now
      with Python and a lot of different technologies. Once we see a pattern
      emerge from one type of bot, then we work to reverse engineer the
      technology they use and identify it as malicious.




      It'll take time and additional challenges to understand how exactly they are detecting Selenium, but what can we say for sure at the moment:




      • it's not related to the actions you take with selenium - once you navigate to the site, you get immediately detected and banned. I've tried to add artificial random delays between actions, take a pause after the page is loaded - nothing helped

      • it's not about browser fingerprint either - tried it in multiple browsers with clean profiles and not, incognito modes - nothing helped

      • since, according to the hint in the interview, this was "reverse engineering", I suspect this is done with some JS code being executed in the browser revealing that this is a browser automated via selenium webdriver


      Decided to post it as an answer, since clearly:




      Can a website detect when you are using selenium with chromedriver?




      Yes.





      Also, I haven't yet experimented with older Selenium and older browser versions. In theory, something could have been implemented or added to Selenium at a certain point that the Distil Networks bot detector currently relies on. If that is the case, we might detect (yeah, let's detect the detector) at what point/version the relevant change was made, then look into the changelog and changesets; maybe this would give us more information on where to look and what they use to detect a WebDriver-powered browser. It's just a theory that needs to be tested.
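The version hunt suggested above is essentially a bisection. Here is a minimal sketch, assuming detection is monotonic (once a release becomes detectable, every later one is too). `is_detected` is a hypothetical callback you would supply yourself, for example one that installs the given Selenium release, loads the site and reports whether the block page appeared; the version labels are placeholders.

```python
from typing import Callable, Optional, Sequence


def first_detected_version(
    versions: Sequence[str],
    is_detected: Callable[[str], bool],
) -> Optional[str]:
    """Return the earliest version for which is_detected() is True.

    Assumes `versions` is ordered oldest to newest and that detection is
    monotonic: every version after the first detected one is also detected.
    Returns None when even the newest version goes undetected.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_detected(versions[mid]):
            hi = mid          # the change landed at mid or earlier
        else:
            lo = mid + 1      # the change landed after mid
    return versions[lo] if lo < len(versions) else None


if __name__ == "__main__":
    # Placeholder history and a stand-in predicate, for illustration only.
    history = ["v1", "v2", "v3", "v4", "v5", "v6"]
    print(first_detected_version(history, lambda v: history.index(v) >= 3))
```

Bisection keeps the number of manual test runs logarithmic in the number of releases, which matters when each run can get an IP banned.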










      edited Oct 29 '15 at 0:33

























      answered Oct 28 '15 at 23:39









      alecxe

      318k63604830
















      • This is crazy. So they really have a way of detecting it that no one else has. I really want to figure out how they're doing it. Can you provide any other information at all as to how they could possibly be doing it?
        – Ryan Weinstein
        Oct 29 '15 at 20:50










      • @RyanWeinstein well, we have no actual proof and we can only speculate and test. For now, I would say they have a way to detect us using selenium. Try experimenting with selenium versions - this may give you some clues.
        – alecxe
        Oct 29 '15 at 22:19






      • 1




        Could it have to do with how ephemeral ports are determined? The method stays away from well-known ranges. github.com/SeleniumHQ/selenium/blob/…
        – Elliott
        Jan 12 '16 at 22:12








      • 6




        Easyjet is using the Distil Networks service. It can block dumb bots but not the sophisticated ones: we tested it with more than 2,000 requests a day from different IPs (we reuse the same addresses), so each IP makes about 5-10 requests a day. From this I can tell that all these bot-detection services are just there to develop and sell some 45%-working algorithms. The scraper we used was easy to detect; I could block it, while Distil Networks, squareshield and others couldn't, which pushed me to never use any of them.
        – Jeffery ThaGintoki
        Feb 20 '17 at 18:16






      • 2




        @alecxe Any updates on this? Still can't use Selenium with Distil?
        – Utku
        May 21 at 15:17






















      up vote
      18
      down vote













      Example of how it's implemented on wellsfargo.com:



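      // The checks return !+"" (which evaluates to true) as soon as a Selenium marker is found.
      try {
        // "webdriver" attribute on the <html> element
        if (window.document.documentElement.getAttribute("webdriver")) return !+
      } catch (IDLMrxxel) {}
      try {
        // global named after the Selenium IDE recorder
        if ("_Selenium_IDE_Recorder" in window) return !+""
      } catch (KknKsUayS) {}
      try {
        // WebDriver-related property on document
        if ("__webdriver_script_fn" in document) return !+""
      // (the excerpt ends here, mid-block)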
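To see whether your own session would trip checks like these, you can run the same three probes from the driver side. This is only a sketch: the marker names come from the excerpt above, while `PROBE_JS`, `tripped_markers` and the result keys are names made up for this illustration, and real detection scripts almost certainly check more than this.

```python
from typing import Dict, List

# JavaScript mirroring the three checks in the excerpt above. It returns an
# object of booleans instead of bailing out on the first hit.
PROBE_JS = """
return {
    webdriver_attr: !!document.documentElement.getAttribute("webdriver"),
    ide_recorder: "_Selenium_IDE_Recorder" in window,
    script_fn: "__webdriver_script_fn" in document
};
"""


def tripped_markers(probe_result: Dict[str, bool]) -> List[str]:
    """Return the names of the markers that were found, sorted for stable output."""
    return sorted(name for name, found in probe_result.items() if found)
```

With a live session, `driver.execute_script(PROBE_JS)` hands the object back as a Python dict, which can be passed straight to `tripped_markers`.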








      edited Oct 11 '17 at 10:18









      Shubham Jain

      7,49352860














      answered Sep 11 '16 at 23:21









      aianitro

      45469












      • 2




        Why is the last try not closed? Also, can you explain your answer a little?
        – ishandutta2007
        Aug 22 at 16:42

















      up vote
      8
      down vote













      Try using Selenium with a specific Chrome user profile. That way you can run it as a specific user and define anything you want, and it will run as a 'real' user. Look at the Chrome process with a process explorer and you'll see the difference in the tags (command-line switches).



      For example:



      import os
      from selenium import webdriver

      username = os.getenv("USERNAME")
      # Raw string so the Windows backslashes are not treated as escapes.
      userProfile = r"C:\Users\{}\AppData\Local\Google\Chrome\User Data\Default".format(username)
      options = webdriver.ChromeOptions()
      options.add_argument("user-data-dir={}".format(userProfile))
      # add here any tag you want.
      options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
      # Assumed location of chromedriver.exe; adjust to your install.
      chromedriver = r"C:\Python27\chromedriver\chromedriver.exe"
      os.environ["webdriver.chrome.driver"] = chromedriver
      # executable_path and chrome_options are Selenium 3-era keyword arguments.
      browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=options)


      chrome tag list here









          answered Oct 28 '15 at 16:39









          Kobi K

          4,70922660


























              up vote
              7
              down vote














              partial interface Navigator {
              readonly attribute boolean webdriver;
              };



              The webdriver IDL attribute of the Navigator interface must return the value of the webdriver-active flag, which is initially false.



              This property allows websites to determine that the user agent is under control by WebDriver, and can be used to help mitigate denial-of-service attacks.




              Taken directly from the 2017 W3C Editor's Draft of WebDriver. This heavily implies that, at the very least, future iterations of Selenium's drivers will be identifiable to prevent misuse. Ultimately, it's hard to tell without the source code what exactly causes ChromeDriver in particular to be detectable.
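A page script reads this flag as `navigator.webdriver`. As a sketch of how one might report it from the automation side: `WEBDRIVER_FLAG_JS` and `describe_webdriver_flag` are illustrative names, and the three-way reading (absent, false, true) is an assumption about browsers of different ages rather than something the draft spells out.

```python
from typing import Optional

# Script suitable for Selenium's driver.execute_script(); a JavaScript
# `undefined` comes back to Python as None.
WEBDRIVER_FLAG_JS = "return navigator.webdriver;"


def describe_webdriver_flag(value: Optional[bool]) -> str:
    """Turn the raw navigator.webdriver value into a short description."""
    if value is None:
        return "not exposed"   # attribute absent, e.g. an older browser
    if value:
        return "flagged"       # the webdriver-active flag is set
    return "not flagged"       # attribute exists and reads false
```

For example, `describe_webdriver_flag(driver.execute_script(WEBDRIVER_FLAG_JS))` would summarise what a site could read in the current session.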









              answered Jan 27 '17 at 23:05









              bryce

              12016












              • 4




                "it's hard to tell without the source code" .. well the source code is freely available
                – Corey Goldberg
                Nov 27 '17 at 16:08






              • 2




                I meant without the website in question's source code. It's hard to tell what they are checking against.
                – bryce
                Mar 19 at 21:12





















              up vote
              5
              down vote













              Even if you are sending all the right data (e.g. Selenium doesn't show up as an extension, you have a reasonable resolution/bit-depth, etc.), there are a number of services and tools that profile visitor behaviour to determine whether the actor is a user or an automated system.



              For example, visiting a site then immediately going to perform some action by moving the mouse directly to the relevant button, in less than a second, is something no user would actually do.
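As a toy illustration of this kind of behavioural check (not how any particular service works), the function below flags a session whose first click arrives sooner after page load than a person could plausibly manage. The one-second threshold echoes the example above; the event format and all names are assumptions made for this sketch.

```python
from typing import Iterable, Tuple

Event = Tuple[float, str]  # (seconds since page load, event type)


def looks_scripted(events: Iterable[Event], min_human_delay: float = 1.0) -> bool:
    """Return True if the first click came faster than min_human_delay seconds."""
    click_times = [t for t, kind in events if kind == "click"]
    if not click_times:
        return False  # nothing clicked, nothing to judge
    return min(click_times) < min_human_delay
```

A real profiler would presumably combine many such signals (mouse paths, scroll patterns, typing cadence) rather than rely on a single threshold.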



              It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it'll also help you verify whether there are any specific parameters that indicate you're running in Selenium.









              answered Oct 25 '15 at 22:01









              lfaraone

              17k154465












              • 1




                I've already used that website and the fingerprint is identical to my normal browser. Also I'm not automating anything. I'm just browsing as normal.
                – Ryan Weinstein
                Oct 26 '15 at 4:46


















              up vote
              4
              down vote













              It sounds like they are behind a web application firewall (WAF). Take a look at ModSecurity and OWASP to see how those work. In reality, what you are asking is how to evade bot detection. That is not what Selenium WebDriver is for: it is for testing your own web application, not for hitting other people's web applications. It is possible, but you would have to look at what a WAF checks for in its rule set and specifically avoid it with Selenium, if you can. Even then, it might not work, because you don't know which WAF they are using. You took the right first step, which is faking the user agent. If that didn't work, though, then a WAF is in place and you probably need to get trickier.



              Edit:
              Point taken from the other answer. Make sure your user agent is actually being set correctly first. Maybe have it hit a local web server, or sniff the outgoing traffic.
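For the "hit a local web server" check, something like the following would do. It is a rough sketch using only the Python standard library; the names `HeaderLogger`, `start_server` and `seen_user_agents` are made up for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_user_agents = []  # filled in as requests arrive


class HeaderLogger(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the User-Agent the client actually sent and echo it back.
        ua = self.headers.get("User-Agent", "")
        seen_user_agents.append(ua)
        body = ua.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_server(port=0):
    """Start the server on localhost in a daemon thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HeaderLogger)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After `server = start_server()`, point the Selenium-driven browser at `http://127.0.0.1:<port>/` (the port is `server.server_address[1]`) and compare the echoed string with what your normal browser sends to the same URL.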






              • I think you are on the correct path. I tested with my setup and replaced the User Agent with a valid user agent string that successfully went through, and got the same result: StubHub blocked the request.
                – Brian Cain
                Oct 23 '15 at 23:36










              • Okay, if the user agent is fine, then they have attack detection in place for sure. WAF is a good place to start. Not that I'm condoning hitting other websites. I'm just answering in the name of science and advancement of human knowledge.
                – Bassel Samman
                Oct 23 '15 at 23:49






              • 1




                If it was an HTTP header issue then wouldn't the normal browser get blocked? The HTTP headers are exactly the same. Also what exactly am I looking at with that github link? Have you tried using selenium to go on stubhub? Something is very very off.
                – Ryan Weinstein
                Oct 26 '15 at 21:15






              • 1




                I'm sorry for the confusion. I'll look into that and you don't have to help me anymore if you don't want to. Most of my experience is in programming systems applications, so I was not familiar with these modsecurity rules that you're talking about. I'll take a look and try to educate myself. I'm not trying to bypass anything, I was just interested in knowing how these websites detect a user using selenium.
                – Ryan Weinstein
                Oct 27 '15 at 18:49






              • 1




                I'm a developer too :). Learning is a cause I can get behind. I don't mind helping, I just wanted to make clear that I didn't know your intentions and could not exactly help you bypass their website security. To answer your question though, it is not Selenium that they are detecting. The rules detected suspicious behavior and decided to take the appropriate measures against the offending client. They catch you by what you are not doing more than by what you are doing. In the repo link, you can check out this file to get an idea: base_rules/modsecurity_crs_20_protocol_violations.conf
                – Bassel Samman
                Oct 28 '15 at 1:29
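To give a feel for what "protocol violation" checks mean in general, here is a toy sketch in Python. The function name and the three checks are illustrative assumptions picked for this example; they are not transcribed from the file named above or from any real WAF rule set.

```python
def protocol_violations(method, headers, body=b""):
    """Return a list of human-readable findings for one HTTP request.

    `headers` is a dict of header name -> value. The three checks below are
    illustrative examples only, not a port of any real rule set.
    """
    h = {k.lower(): v for k, v in headers.items()}
    findings = []
    if "host" not in h:
        findings.append("missing Host header")
    cl = h.get("content-length")
    if cl is not None and not cl.strip().isdigit():
        findings.append("non-numeric Content-Length")
    if method.upper() in ("GET", "HEAD") and body:
        findings.append("GET/HEAD request with a body")
    return findings
```

Two of these checks fire on something malformed, but the Host check fires on something that is missing, which is the sense in which a client can be caught "by what you are not doing".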















              edited Oct 23 '15 at 23:53

























              answered Oct 23 '15 at 23:28









              Bassel Samman

              up vote
              4
              down vote













              Firefox is said to set window.navigator.webdriver === true when it is driven by a webdriver. That was according to one of the older specs (e.g. archive.org), but I couldn't find it in the new one, apart from some very vague wording in the appendices.



              A test for it is in the Selenium code, in the file fingerprint_test.js, where the comment at the end says "Currently only implemented in firefox". However, with some simple grepping I wasn't able to identify any code in that direction, neither in the current (41.0.2) Firefox release tree nor in the Chromium tree.



              I also found a comment on an older commit regarding fingerprinting in the Firefox driver, b82512999938, from January 2015. That code is still in the Selenium Git master downloaded yesterday, at javascript/firefox-driver/extension/content/server.js, with a comment linking to the slightly differently worded appendix in the current W3C WebDriver spec.
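If you want to check this on your own browser/driver combination, a small helper like the one below should be enough. It is a sketch: `read_webdriver_flag` is a name invented here, and `driver` is assumed to be a Selenium WebDriver instance (anything with an `execute_script` method will work).

```python
def read_webdriver_flag(driver):
    """Return whatever the page sees in navigator.webdriver.

    `driver` is any object with Selenium's execute_script(script) method.
    The result may be True, False, or None (property undefined in the browser).
    """
    return driver.execute_script("return window.navigator.webdriver;")
```

Run it once in a Selenium-driven session and evaluate the same expression in the developer console of a normally started browser; any difference is something a page's own JavaScript could see as well.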






              • 1




                I just tested webdriver with Firefox 55 and I can confirm this is not true. The variable window.navigator.webdriver is not defined.
                – speedplane
                Oct 2 '17 at 17:56















              answered Oct 27 '15 at 23:44









              deamentiaemundi

              up vote
              3
              down vote













              The bot detection I've seen seems more sophisticated than, or at least different from, what I've read through in the answers below.



              EXPERIMENT 1:




              1. I open a browser and web page with Selenium from a Python console.

              2. The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.

              3. I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).

              4. I press the left mouse button again (remember, cursor is above a given link).

              5. The link opens normally, as it should.


              EXPERIMENT 2:




              1. As before, I open a browser and the web page with Selenium from a Python console.


              2. This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.


              3. The link doesn't open, but I am taken to a sign up page.



              IMPLICATIONS:




              • opening a web browser via Selenium doesn't preclude me from appearing human

              • moving the mouse like a human is not necessary to be classified as human

              • clicking something via Selenium with an offset still raises the alarm


              Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.
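For anyone who wants to repeat Experiment 2, the Selenium click with a random offset can be sketched roughly as follows. The helper names and the default offsets are made up for this example, and it assumes a recent Selenium 4 release; which point of the element the offset is measured from has differed between Selenium versions, so check your version's documentation before relying on negative offsets.

```python
import random


def random_offset(max_dx, max_dy, rng=random):
    """Pick an integer (dx, dy) with |dx| <= max_dx and |dy| <= max_dy."""
    return rng.randint(-max_dx, max_dx), rng.randint(-max_dy, max_dy)


def click_with_offset(driver, element, max_dx=5, max_dy=5):
    """Click `element` through Selenium at a small random offset."""
    # Imported here so the helper above works without Selenium installed.
    from selenium.webdriver.common.action_chains import ActionChains

    dx, dy = random_offset(max_dx, max_dy)
    # Move the virtual pointer to the offset position, then click there.
    ActionChains(driver).move_to_element_with_offset(element, dx, dy).click().perform()
    return dx, dy
```

Note that the click here is synthesized by the driver rather than by the operating system's pointer, which is exactly the difference between the two experiments.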






              • 1




                My belief is that Selenium injects something into the page via javascript to find and access elements. This injection is what I believe they are detecting.
                – zeusalmighty
                Oct 25 at 13:31















              up vote
              3
              down vote













              The bot detection I've seen seems more sophisticated or at least different than what I've read through in the answers below.



              EXPERIMENT 1:




              1. I open a browser and web page with Selenium from a Python console.

              2. The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.

              3. I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).

              4. I press the left mouse button again (remember, cursor is above a given link).

              5. The link opens normally, as it should.


              EXPERIMENT 2:




              1. As before, I open a browser and the web page with Selenium from a Python console.


              2. This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.


              3. The link doesn't open, but I am taken to a sign up page.



              IMPLICATIONS:




              • opening a web browser via Selenium doesn't preclude me from appearing human

              • moving the mouse like a human is not necessary to be classified as human

              • clicking something via Selenium with an offset still raises the alarm


              Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.






              share|improve this answer



















              • 1




                My belief is that Selenium injects something into the page via javascript to find and access elements. This injection is what I believe they are detecting.
                – zeusalmighty
                Oct 25 at 13:31













              up vote
              3
              down vote










              up vote
              3
              down vote









              The bot detection I've seen seems more sophisticated or at least different than what I've read through in the answers below.



              EXPERIMENT 1:




              1. I open a browser and web page with Selenium from a Python console.

              2. The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.

              3. I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).

              4. I press the left mouse button again (remember, cursor is above a given link).

              5. The link opens normally, as it should.


              EXPERIMENT 2:




              1. As before, I open a browser and the web page with Selenium from a Python console.


              2. This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.


              3. The link doesn't open, but I am taken to a sign up page.



              IMPLICATIONS:




              • opening a web browser via Selenium doesn't preclude me from appearing human

              • moving the mouse like a human is not necessary to be classified as human

              • clicking something via Selenium with an offset still raises the alarm


              Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.






              share|improve this answer














              The bot detection I've seen seems more sophisticated or at least different than what I've read through in the answers below.



              EXPERIMENT 1:




              1. I open a browser and web page with Selenium from a Python console.

              2. The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.

              3. I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).

              4. I press the left mouse button again (remember, cursor is above a given link).

              5. The link opens normally, as it should.


              EXPERIMENT 2:




              1. As before, I open a browser and the web page with Selenium from a Python console.


              2. This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.


              3. The link doesn't open, but I am taken to a sign up page.



              IMPLICATIONS:




              • opening a web browser via Selenium doesn't preclude me from appearing human

              • moving the mouse like a human is not necessary to be classified as human

              • clicking something via Selenium with an offset still raises the alarm


              Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.















              edited Jul 20 at 11:24

























              answered Apr 11 at 18:41









              M3RS

              866517












              • 1




                I believe Selenium injects JavaScript into the page to find and access elements, and that this injection is what they are detecting.
                – zeusalmighty
                Oct 25 at 13:31
























              up vote
              2
              down vote













              Result of obfuscating the JavaScript files



              I checked the chromedriver source code. It injects some JavaScript files into the browser.
              Every JavaScript file at this link is injected into the web pages:
              https://chromium.googlesource.com/chromium/src/+/master/chrome/test/chromedriver/js/



              So I reverse engineered chromedriver and obfuscated the JS files by hex editing. After that I was sure that no JavaScript variable names, function names, or fixed strings remained that could reveal Selenium activity. But some sites, and reCAPTCHA, still detect Selenium!

              Maybe they check the modifications caused by the chromedriver JS execution :)




              Edit 1:



              Modifying Chrome 'navigator' parameters



              I discovered that some parameters in 'navigator' reveal the use of chromedriver.
              These are the parameters:



              • "navigator.webdriver": in non-automated mode it is 'undefined'; in automated mode it is 'true'.

              • "navigator.plugins": in headless Chrome it has length 0, so I added some fake elements to fool the plugin length check.

              • "navigator.languages": was set to the default Chrome value '["en-US", "en", "es"]'.
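As a rough illustration of how a page might read these three properties, here is a minimal sketch. The function name `automationHints` and the hint labels are hypothetical, and treating an empty `languages` list as suspicious is an assumption rather than something stated above. The non-automated value of `navigator.webdriver` may be `undefined` or `false` depending on the Chrome version, so only an explicit `true` is counted.

```javascript
// Returns a list of hints that a navigator-like object belongs to an
// automated browser. All names and labels here are illustrative.
function automationHints(nav) {
  const hints = [];
  // Only an explicit `true` counts as a hint.
  if (nav.webdriver === true) hints.push("webdriver");
  // Headless Chrome is reported above to expose an empty plugin list.
  if (!nav.plugins || nav.plugins.length === 0) hints.push("no-plugins");
  // Assumption: an empty or missing languages list is treated as suspicious.
  if (!Array.isArray(nav.languages) || nav.languages.length === 0) {
    hints.push("no-languages");
  }
  return hints;
}

// In a browser console: automationHints(navigator)
```

Real detection services presumably combine many more signals than these three, which would be consistent with the result described in this answer.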


              So what I needed was a Chrome extension to run JavaScript on the web pages. I made an extension with the JS code provided in the article, and used another article to add the zipped extension to my project. I successfully changed the values, but the detection result stayed the same!
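The extension code itself is not shown in this answer, so the following is only a sketch of the general technique of redefining properties with `Object.defineProperty`. The helper name `overrideNavigator` is hypothetical. Note that Chrome content scripts normally run in an isolated world, so an override like this generally has to be injected into the page's own context before the page can see it.

```javascript
// Hypothetical helper: redefine properties on a navigator-like object.
// `nav` would be `navigator` when run inside the page context.
function overrideNavigator(nav, values) {
  for (const [name, value] of Object.entries(values)) {
    // A getter mirrors how navigator properties are normally exposed.
    Object.defineProperty(nav, name, {
      get: () => value,
      configurable: true,
    });
  }
  return nav;
}

// Example call in a page context (values taken from the list above):
// overrideNavigator(navigator, {
//   webdriver: undefined,
//   languages: ["en-US", "en", "es"],
// });
```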



              I didn't find other variables like these, but that doesn't mean they don't exist. reCAPTCHA still detects chromedriver, so there are presumably more variables to change. The next step would be reverse engineering the detection services, which I don't want to do.



              Now I'm not sure whether it is worth spending more time on this automation process or whether I should look for alternative methods!














                  edited Dec 5 at 19:35

























                  answered Dec 5 at 12:56









                  ShayanKM

                  129210


























                      up vote
                      1
                      down vote













                      Write an HTML page with the following code. You will see that Selenium adds a webdriver attribute to the DOM, visible in the outerHTML.






                      <html>
                      <head>
                      <script type="text/javascript">
                      // Show the serialized document, including any attribute
                      // that an automation tool has added to the <html> element.
                      function showWindow() {
                          alert(document.documentElement.outerHTML);
                      }
                      </script>
                      </head>
                      <body>
                      <form>
                      <input type="button" value="Show outerHTML" onclick="showWindow()">
                      </form>
                      </body>
                      </html>


















                      answered Oct 28 '15 at 4:10









                      PC3TJ

                      716415












                      • 4




                        The attribute is added only in Firefox.
                        – Louis
                        Oct 28 '15 at 9:22






                      • 1




                        And it is possible to remove it from the Selenium extension that controls the browser. It will work anyway.
                        – erm3nda
                        Jun 12 '17 at 23:53
























                      up vote
                      1
                      down vote













                      Some sites are detecting this:



                      function d() {
                          try {
                              // Property that chromedriver adds to the document
                              if (window.document.$cdc_asdjflasutopfhvcZLmcfl_.cache_)
                                  return !0
                          } catch (e) {}

                          try {
                              //if (window.document.documentElement.getAttribute(decodeURIComponent("%77%65%62%64%72%69%76%65%72")))
                              if (window.document.documentElement.getAttribute("webdriver"))
                                  return !0
                          } catch (e) {}

                          try {
                              //if (decodeURIComponent("%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72") in window)
                              if ("_Selenium_IDE_Recorder" in window)
                                  return !0
                          } catch (e) {}

                          try {
                              //if (decodeURIComponent("%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E") in document)
                              if ("__webdriver_script_fn" in document)
                                  return !0
                          } catch (e) {}

                          // Assumed ending: the original snippet is cut off here.
                          return !1
                      }
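The commented-out lines in the snippet above are the obfuscated originals. A short sketch using the standard `decodeURIComponent` function shows that they decode to the plain strings used in the readable version. The variable names `encoded` and `decoded` are just for illustration.

```javascript
// Percent-encoded strings copied from the commented-out lines above.
const encoded = [
  "%77%65%62%64%72%69%76%65%72",
  "%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72",
  "%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E",
];

// Each %XX pair is one byte, so decoding yields the plain identifier.
const decoded = encoded.map((s) => decodeURIComponent(s));

// decoded is ["webdriver", "_Selenium_IDE_Recorder", "__webdriver_script_fn"]
console.log(decoded);
```

Encoding the names this way hides them from a simple text search of the detection script.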















                      answered Aug 22 '17 at 9:52









                      Néstor Lim

                      1155
















                      • This doesn't work for Chrome and Firefox, selenium 3.5.0, ChromeDriver 2.31.488774, geckodriver 0.18.0
                        – jerrypy
                        Aug 29 '17 at 6:35




























                      up vote
                      0
                      down vote













                      It seems to me the simplest way to do it with Selenium is to intercept the XHR that sends back the browser fingerprint.
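The answer does not say how the interception would be done. One possible approach, sketched here under that assumption, is to wrap the `send` method of `XMLHttpRequest.prototype` from inside the page. The names `interceptSend` and `rewrite` are hypothetical, and requests made with `fetch` or `navigator.sendBeacon` would not be covered by this.

```javascript
// Wraps `send` on an XMLHttpRequest-like prototype so each outgoing body
// can be inspected (and optionally replaced) before it leaves the page.
function interceptSend(proto, rewrite) {
  const originalSend = proto.send;
  proto.send = function (body) {
    // `rewrite` is a hypothetical callback that returns the body to send.
    return originalSend.call(this, rewrite(body, this));
  };
  // Return a function that restores the original behaviour.
  return () => {
    proto.send = originalSend;
  };
}

// In a page context (assumption): log every body without changing it.
// interceptSend(XMLHttpRequest.prototype, (body) => { console.log(body); return body; });
```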



                      But since this is a Selenium-only problem, it's better just to use something else. Selenium is supposed to make things like this easier, not way harder.
















                      answered Dec 2 at 1:32









                      pguardiario

                      35.7k979112
















                      • What are the alternatives to Selenium?
                        – Tai
                        Dec 3 at 17:38










                      • And can they be detected as well?
                        – Tai
                        Dec 3 at 17:45










                      • I guess Requests would be the main Python option. If you send exactly the same requests that your browser sends, you will appear as a normal browser.
                        – pguardiario
                        Dec 3 at 23:14

























