Warm tip: This article is reproduced from stackoverflow.com, please click
elasticsearch resthighlevelclient

Why simple query by id could cause timeout exception?

发布于 2020-04-09 22:54:47

Occasionally, there are below exception in production environment,

2020-01-29 17:10:46.085 ERROR 2852 --- [o-8022-exec-258] c.c.p.common.dao.SearchDao               : Search person by id failed

java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-832 [ACTIVE]
        at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1403) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1373) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:699) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]

Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-832 [ACTIVE]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.4.jar!/:4.1.4]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.4.jar!/:4.1.4]
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]

but it's just a simple query, not a complicated query

 curl 'http://localhost:9201/person/_doc/30154410564?pretty'

and in this time, the load is very low enter image description here

So why could exist these timeout exceptions? and there are many search query but why only this simple query by id could cause this exception?

The person index is synchronized from Oracle DB, and there is a scheduled task, every 10 min it will sync changed person to person index, and if access person index in this time, it will cause 30,000 milliseconds timeout. So how to resolve it? And it seems access by Java client, it will exist this phenomenon, but access by curl in command line it dose not exist this phenomenon.

PS:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   person              jb3msRw5S9ixgXN5SLd6bw   1   0  140754205     19239587     19.8gb         19.8gb

and in this time there was index write for the person index enter image description here

RestClient Config:

private final RestHighLevelClient restHighLevelClient;
restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost(host, port)));
Questioner
zhuguowei
Viewed
570
zhuguowei 2020-02-01 16:47

By calling hot_threads api

 curl 'http://localhost:9201/_nodes/hot_threads?pretty'

got below info:

   100.9% (504.4ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][get][T#6]'
     8/10 snapshots sharing following 33 elements
       app//org.apache.lucene.index.SingletonSortedNumericDocValues.nextDoc(SingletonSortedNumericDocValues.java:53)
       app//org.apache.lucene.codecs.lucene80.IndexedDISI.writeBitSet(IndexedDISI.java:196)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.writeValues(Lucene80DocValuesConsumer.java:214)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.addNumericField(Lucene80DocValuesConsumer.java:111)
       app//org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addNumericField(PerFieldDocValuesFormat.java:109)
       app//org.apache.lucene.index.ReadersAndUpdates.handleDVUpdates(ReadersAndUpdates.java:368)
       app//org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:570)
       app//org.apache.lucene.index.ReaderPool.writeAllDocValuesUpdates(ReaderPool.java:228)
       app//org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3308)
       app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:520)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:294)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:269)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:259)
       app//org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112)
       app//org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
       app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
       app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
       app//org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
       app//org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
       app//org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1548)
       app//org.elasticsearch.index.engine.InternalEngine.get(InternalEngine.java:652)
       app//org.elasticsearch.index.shard.IndexShard.get(IndexShard.java:916)
       app//org.elasticsearch.index.get.ShardGetService.innerGet(ShardGetService.java:169)
       app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:93)
       app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:84)
       app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:106)
       app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:45)

it seems when query person by id, the internal auto executes refresh, and from official document got

By default, the get API is realtime, and is not affected by the refresh rate of the index (when data will become visible for search). If a document has been updated but is not yet refreshed, the get API will issue a refresh call in-place to make the document visible. This will also make other documents changed since the last refresh visible. In order to disable realtime GET, one can set the realtime parameter to false.

Note: Every time access person detail page, it will update this person's viewCount.

So I disabled realtime explicitly

        GetRequest getRequest = new GetRequest(personIndex, id.toString());
        getRequest.realtime(false);

When did so, timeout problem is solved.