Code, Compose, Bike, Brew

…though not necessarily in that order

Interacting With a reCAPTCHA in Selenium

This one was a bit of a head-scratcher for me, so I thought I’d share it here in case it helps anyone else.

I’m building a web application which allows users to register for membership in a local community group. It’s web facing, so it needs some pretty decent bot prevention to stop us getting scam submissions. After looking at a few products I decided to use Google’s reCAPTCHA. The app is written in Groovy and based on Spring Boot, so I’ve used mkopylec’s excellent recaptcha-spring-boot-starter project to integrate the reCAPTCHA validation into Spring.

Everything was coming together nicely until I tried to add the reCAPTCHA submission to my Selenium tests. The issue arises from the fact that the reCAPTCHA is (very sensibly) loaded into an iframe which is sourced from a different domain. I presume this is done deliberately to make it hard for scripts in the parent page to interact with the reCAPTCHA, as browsers won’t allow it due to the same-origin policy. For the same reason, it makes automated browser-driven testing just that little bit harder to achieve.

However, after a bit of thought I discovered that it was possible to get this working. I’m using Spock, Geb and Selenium in this project, so I’ll explain my implementation in terms of these technologies.

The first step is to define a Page class which models the checkbox. I initially tried to do this using the CSS selectors which would have been appropriate if I were accessing the iframe's contents from within the context of the parent page, but I quickly realised this wasn’t possible. Instead, the trick is to refer to the checkbox using the CSS selector that applies inside the iframe itself:

import geb.Page

// Models the reCAPTCHA checkbox as it exists inside the iframe's own document.
class CaptchaFrame extends Page {
  static content = {
    checkbox { $(".recaptcha-checkbox-checkmark") }
  }
}

Once we’ve got this, the next step is to identify which index the iframe has in the page being tested. Unfortunately we’re unable to locate the iframe by its name, as Google recently removed it, so we have to refer to it using its 0-based index. In my case it’s the first (and only) iframe on the page, hence we find it using index 0. In addition to this, we need to tell Geb that we’re accessing our CaptchaFrame class within the context of the specified iframe, rather than trying to find it in the parent page. To do this we can use Geb’s withFrame method:

withFrame(0, CaptchaFrame) {
  checkbox.click() // runs against the iframe's document, modelled by CaptchaFrame
}
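
If the page ever gains a second iframe, the index-based lookup could become fragile. As a rough sketch of an alternative, withFrame also accepts a Navigator, so the frame can be located with a CSS selector instead. This is untested against a live reCAPTCHA and rests on some assumptions: that the iframe's src attribute contains "recaptcha", that a baseUrl is configured for Geb, and that SignupPage (with its url) is only a hypothetical stand-in for the real signup page.

```groovy
import geb.Page
import geb.spock.GebSpec

// Models the checkbox as it exists inside the reCAPTCHA iframe.
class CaptchaFrame extends Page {
  static content = {
    checkbox { $(".recaptcha-checkbox-checkmark") }
  }
}

// Hypothetical stand-in for the real signup page; the url is an assumption
// and is resolved against the baseUrl set in the Geb configuration.
class SignupPage extends Page {
  static url = "signup"
}

class CaptchaBySelectorSpec extends GebSpec {
  def "can tick the reCAPTCHA when the frame is found by selector"() {
    when:
    to SignupPage
    // Assumption: the reCAPTCHA iframe's src attribute contains "recaptcha".
    withFrame($("iframe[src*='recaptcha']"), CaptchaFrame) {
      checkbox.click()
    }

    then:
    // Only checks that locating the frame and clicking did not throw.
    noExceptionThrown()
  }
}
```

The trade-off is that the selector depends on how Google builds the iframe URL, whereas the index depends on the layout of your own page, so pick whichever is less likely to change in your case.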

Now we can put the above into a simple Spock test to demonstrate it working:

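def "can click the reCAPTCHA and submit the signup form"() {
    when:
        to SignupPage // This is a Geb Page which models my signup form (not shown here).
        withFrame(0, CaptchaFrame) { checkbox.click() } // tick the reCAPTCHA inside its iframe
        submitButton.click() // hypothetical content element on SignupPage; the submit step is not shown here
    then:
        at SignupCompletePage
}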

Hope that helps someone, happy testing!