
When Using HtmlUnit, How Can I Configure the Underlying NekoHTML Parser?

I'm using HtmlUnit to try to scrape a webpage because of its JavaScript support. (I'd rather use Jsoup, but it has no JS support.) The issue relates to a feature of the underlying Neko

Solution 1:

Try initializing the web client with Firefox behavior:

    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);

and enable JavaScript:

    webClient.setJavaScriptEnabled(true);

It should work then.

Solution 2:

Solved by creating a custom BrowserVersion with the HTMLIFRAME_IGNORE_SELFCLOSING feature enabled:

    BrowserVersionFeatures[] bvf = new BrowserVersionFeatures[1];
    bvf[0] = BrowserVersionFeatures.HTMLIFRAME_IGNORE_SELFCLOSING;
    BrowserVersion bv = new BrowserVersion(
            BrowserVersion.NETSCAPE, "5.0 (Windows; en-US)",
            "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8",
            3.6f, bvf);

    WebClient webClient = new WebClient(bv);
    webClient.setJavaScriptEnabled(true);
