Note: this article is reproduced from stackoverflow.com.
httpclient java jsoup optimization httpurlconnection

What makes Jsoup faster than HttpURLConnection and HttpClient in most cases?

Posted on 2020-04-07 10:17:10

I want to compare the performance of the three implementations mentioned in the title, so I wrote a small Java program to help me do this. The main method contains three test blocks, each of which looks like this:

        nb=0; time=0;
        for (int i = 0; i < 7; i++) {
            double v = methodX(url);
            if(v>0){
                nb++;
                time+=v;
            }
        }
        if(nb==0) nb=1;
        System.out.println("HttpClient : "+(time/ ((double) nb))+". Tries "+nb+"/7");

The variable nb counts the successful requests, so that failed requests are excluded from the average. The method methodX is one of the following:

    private static double testWithNativeHUC(String url){
        try {
            HttpURLConnection httpURLConnection= (HttpURLConnection) new URL(url).openConnection();
            httpURLConnection.addRequestProperty("User-Agent", UA);
            long before = System.currentTimeMillis();
            BufferedReader bufferedReader= new BufferedReader(new InputStreamReader(httpURLConnection.getInputStream()));
            while (bufferedReader.readLine()!=null);
            return System.currentTimeMillis()-before;
        } catch (IOException e) {
            e.printStackTrace();
            return -1;
        }
    }

    private static double testWithHC(String url) {
        try {
            CloseableHttpClient httpClient = HttpClientBuilder.create().setUserAgent(UA).build();
            BasicResponseHandler basicResponseHandler = new BasicResponseHandler();
            long before = System.currentTimeMillis();
            CloseableHttpResponse response = httpClient.execute(new HttpGet(url));
            basicResponseHandler.handleResponse(response);
            return System.currentTimeMillis() - before;
        } catch (IOException e) {
            e.printStackTrace();
            return -1;
        }
    }

    private static double testWithJsoup(String url){
        try{
            long before = System.currentTimeMillis();
            Jsoup.connect(url).execute().parse();
            return System.currentTimeMillis()-before;
        }catch (IOException e){
            e.printStackTrace();
            return -1;
        }
    }

What I am getting as output is the following.

for url https://stackoverflow.com :

    HttpUrlConnection : 325.85714285714283. Tries 7/7
    HttpClient : 299.0. Tries 7/7
    Jsoup : 172.42857142857142. Tries 7/7

for url https://online.vfsglobal.dz :

    HttpUrlConnection : 104.57142857142857. Tries 7/7
    HttpClient : 181.0. Tries 7/7
    Jsoup : 57.857142857142854. Tries 7/7

for url https://google.com/ :

    HttpUrlConnection : 251.28571428571428. Tries 7/7
    HttpClient : 259.57142857142856. Tries 7/7
    Jsoup : 299.85714285714283. Tries 7/7

for url https://algeria.blsspainvisa.com/book_appointment.php :

    HttpUrlConnection : 112.57142857142857. Tries 7/7
    HttpClient : 194.85714285714286. Tries 7/7
    Jsoup : 67.42857142857143. Tries 7/7

for url https://tunisia.blsspainvisa.com/book_appointment.php :

    HttpUrlConnection : 439.2857142857143. Tries 7/7
    HttpClient : 283.42857142857144. Tries 7/7
    Jsoup : 144.71428571428572. Tries 7/7

Repeating the tests gives the same results. I didn't sleep between requests so that the tests would finish quickly; I believe this has no big impact on the results.

EDIT: I looked through Jsoup's source code, and it uses HttpURLConnection with a BufferedInputStream. I tried using both in my HttpURLConnection test, but got the same results. As you can see, the difference is clear: Jsoup appears to be faster than HttpURLConnection, even though it uses HttpURLConnection itself!

Thanks in advance,

Questioner: younes zeboudj
Answered by Matthias on 2020-02-05 20:35

Your benchmark is not meaningful.

I wrote a microbenchmark for these three libraries, and the result is that there is no significant difference.

Benchmark                                     Mode  Cnt    Score   Error  Units
HttpBenchmark.httpClientGoogle                avgt    2  151.162          ms/op
HttpBenchmark.httpClientStackoverflow         avgt    2  151.086          ms/op
HttpBenchmark.httpUrlConnectionGoogle         avgt    2  235.869          ms/op
HttpBenchmark.httpUrlConnectionStackoverflow  avgt    2  145.162          ms/op
HttpBenchmark.jsoupGoogle                     avgt    2  391.162          ms/op
HttpBenchmark.jsoupStackoverflow              avgt    2  188.059          ms/op

There are only a few small differences between your tests and mine:

  • Jsoup sets the header "Accept-Encoding: gzip", which reduces the bandwidth used
  • Jsoup uses a bigger buffer (32 KB)
  • The HttpClient instance needs to be reused
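The first two points can be illustrated without any network access. The following is a minimal sketch, not code taken from Jsoup: the class name GzipTransferSketch and the generated sample page are hypothetical, and the payload is only a stand-in for repetitive HTML markup. It compresses the page as a server would for a gzip-encoded response, then decodes it through a 32 KB read buffer and reports the byte counts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipTransferSketch {

    // Hypothetical stand-in for an HTML page: repetitive markup compresses well.
    static byte[] samplePage() {
        StringBuilder sb = new StringBuilder("<html><body>");
        for (int i = 0; i < 2000; i++) {
            sb.append("<div class=\"row\"><a href=\"/item/").append(i).append("\">item</a></div>");
        }
        return sb.append("</body></html>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compresses the payload the way a server would for "Content-Encoding: gzip".
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decodes the gzip bytes through a buffer of the given size and returns
    // the number of decoded bytes, mirroring the read loop of an HTTP client.
    static long decodedSize(byte[] wire, int bufferSize) {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(wire))) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] raw = samplePage();
        byte[] wire = gzip(raw);
        System.out.println("uncompressed bytes:       " + raw.length);
        System.out.println("gzip bytes on the wire:   " + wire.length);
        System.out.println("bytes after decoding:     " + decodedSize(wire, 32 * 1024));
    }
}
```

The fewer bytes that cross the network, the less time a test spends waiting on the transfer, so a client that requests gzip can look faster than one that does not even when both use the same underlying connection class.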

In my tests, Jsoup is the slowest. Of course, only Jsoup parses the response.

My benchmark:

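// Imports assume JMH, jsoup and Apache HttpClient 4.x on the classpath.
import java.io.BufferedInputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.TimeUnit;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.jsoup.Jsoup;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;

@Warmup(iterations = 1, time = 3, timeUnit = TimeUnit.SECONDS)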
@Measurement(iterations = 2, time = 5, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Threads(1)
public class HttpBenchmark {

    private static final String GOOGLE          = "https://google.com/";
    private static final String STACKOVERFLOW   = "https://stackoverflow.com";

    private final CloseableHttpClient httpClient = HttpClientBuilder.create().build();

    @Benchmark
    public void httpClientGoogle() throws Exception {
        httpClient(GOOGLE);
    }

    @Benchmark
    public void httpClientStackoverflow() throws Exception {
        httpClient(STACKOVERFLOW);
    }

    @Benchmark
    public void httpUrlConnectionGoogle() throws Exception {
        httpUrlConnection(GOOGLE);
    }

    @Benchmark
    public void httpUrlConnectionStackoverflow() throws Exception {
        httpUrlConnection(STACKOVERFLOW);
    }

    @Benchmark
    public void jsoupGoogle() throws Exception {
        jsoup(GOOGLE);
    }

    @Benchmark
    public void jsoupStackoverflow() throws Exception {
        jsoup(STACKOVERFLOW);
    }

    private void httpClient(final String url) throws Exception {
        // try-with-resources releases the connection even if the handler throws
        try (final CloseableHttpResponse response = httpClient.execute(new HttpGet(url))) {
            new BasicResponseHandler().handleResponse(response);
        }
    }

    private void httpUrlConnection(final String url) throws Exception {
        final HttpURLConnection httpURLConnection = (HttpURLConnection) new URL(url).openConnection();
        httpURLConnection.addRequestProperty("Accept-Encoding", "gzip");
        try (final BufferedInputStream r = new BufferedInputStream(httpURLConnection.getInputStream())) {
            final byte[] tmp = new byte[1024 * 32];
            int read;
            while (true) {
                read = r.read(tmp);
                if (read == -1) {
                    break;
                }
            }
        }
    }

    private void jsoup(final String url) throws Exception {
        Jsoup.connect(url).execute().parse();
    }

}
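
The @Warmup and @Measurement annotations in the benchmark above separate untimed warm-up runs from the measured ones. For readers who want to see that idea without JMH, here is a rough hand-rolled sketch. It is not a substitute for JMH; the class name WarmupTimingSketch, the run counts and the placeholder workload are all hypothetical, and in a real comparison the task would be one of the HTTP calls above.

```java
import java.util.Arrays;

final class WarmupTimingSketch {

    // Keeps the placeholder workload from being optimized away entirely.
    static volatile long sink;

    // Runs the task `warmup` times untimed, then `runs` times timed.
    // Returns the measured durations in nanoseconds, sorted ascending.
    static long[] measure(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run(); // untimed: class loading, JIT and connection setup happen here
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // intended for elapsed time, unlike currentTimeMillis
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    // Median of an already sorted, non-empty array; less sensitive to outliers than the mean.
    static long median(long[] sorted) {
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder workload; replace with the call under test.
        Runnable task = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            sink = sum;
        };
        long[] samples = measure(task, 3, 7);
        System.out.printf("median %.3f ms, min %.3f ms, max %.3f ms%n",
                median(samples) / 1e6, samples[0] / 1e6, samples[samples.length - 1] / 1e6);
    }
}
```

Reporting the median together with the minimum and maximum makes it easier to spot a single slow first request, which a plain average over seven runs would hide.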