This is because Found attaches multiple IPs to your 0298347602938ahdf.us-east-1.aws.found.io hostname, and those IPs change over time. So if you use a TransportClient with SSL on port 9343 and add only the first IP you find with client.addTransportAddress(new InetSocketTransportAddress(host, port)), it’ll eventually stop working because it’s stuck with an old, invalid IP. The solution is to look up all the IPs on the hostname and add them to the TransportClient, then repeat this every 1 to 5 minutes (or at any interval shorter than the DNS TTL). The TransportClient checks for duplicates and reachability, so the system should stay stable from then on. I wrote a gist for doing this with Spring and Groovy: https://gist.github.com/vaughnd/04350e4c5bf51dedabb8
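For illustration, here is a minimal sketch of that lookup-and-refresh loop in plain Java. It is not the code from the gist. The class name `DnsRefresher`, its method names, and the `register` callback are assumptions made up for this example. In a real application the callback would be the place to call client.addTransportAddress(...) on your TransportClient.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: periodically re-resolves a hostname and hands every IP to a callback.
class DnsRefresher {

    // Resolve every IP currently attached to the hostname, paired with the given port.
    static List<InetSocketAddress> resolveAll(String host, int port) {
        try {
            List<InetSocketAddress> result = new ArrayList<>();
            for (InetAddress address : InetAddress.getAllByName(host)) {
                result.add(new InetSocketAddress(address, port));
            }
            return result;
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Re-resolve immediately and then every periodSeconds, passing each address to `register`.
    // Assumption: `register` wraps client.addTransportAddress(...) in the real application.
    static ScheduledExecutorService start(String host, int port, long periodSeconds,
                                          Consumer<InetSocketAddress> register) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "dns-refresh");
            t.setDaemon(true); // do not keep the JVM alive just for this task
            return t;
        });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                resolveAll(host, port).forEach(register);
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and carry on.
                System.err.println("DNS refresh failed for " + host + ": " + e);
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Calling something like DnsRefresher.start(host, 9343, 60, address -> ...) at startup would re-register the addresses once a minute. Keep in mind that the JVM caches DNS lookups itself (the `networkaddress.cache.ttl` security property), so that cache needs to expire often enough for the refresh to see new IPs.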
Category Archives: Elasticsearch
When I index something in Elasticsearch, why doesn’t it show in my search straight after?
I got burnt by this little architectural nuance in Elasticsearch recently. While batch processing items in a content store, updating their status, then searching for more items, I kept getting stale data and didn’t understand why. It turned out that Elasticsearch is _near_ realtime, with a default refresh interval of 1s. So if you index a document and query for it within a second, you’re going to see old data. The best way around this is to refresh the index just before you query it, to make sure you have the latest data.
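As a rough sketch, a refresh can be triggered over the REST API by sending a POST to the index’s `_refresh` endpoint. The class name `IndexRefresher`, the base URL, and the index name used below are assumptions for illustration. The example uses only the JDK’s HTTP classes rather than the Elasticsearch Java client.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper that asks Elasticsearch to refresh one index over HTTP.
class IndexRefresher {

    // Build the refresh URL, e.g. http://localhost:9200/content/_refresh
    static String refreshUrl(String baseUrl, String index) {
        return baseUrl + "/" + index + "/_refresh";
    }

    // Send the refresh request and return the HTTP status code (200 on success).
    static int refresh(String baseUrl, String index) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(refreshUrl(baseUrl, index)).toURL().openConnection();
        connection.setRequestMethod("POST");
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

In a batch job like the one above, this would be called as IndexRefresher.refresh("http://localhost:9200", "content") after the status updates and before the next search. A refresh is not free, so one call per batch is likely a better fit than one per document.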
Centralising Clojure/Java logging with Logback, LogStash, ElasticSearch, and Kibana
Checking logs when you have more than one server is painful. Use Logback/Logstash-forwarder to send JSON-formatted logs to a central server running Logstash/ElasticSearch/Kibana, where you can then slice and dice logs to your heart’s content with the power of ElasticSearch and Kibana.
Confs and docs available here: https://github.com/vaughnd/centralised-logging