Vaughn Dickson

Easy async function execution in Python/Django using a queue and consumer Thread

I didn’t want to set up django-q2, Celery, or any of the other heavier background task runners for Django, since I just wanted to make basic API calls and save tracking data to the db without blocking user requests. My async work also didn’t need to be transactional, so losing the queue due to app …

How to work for an international software company from SA and maximise your earnings

I’m always astounded at how little South African companies pay for the same roles and experience as European companies do. The reality is that EU companies can generally afford to pay more than most SA companies, and the demand for talent is so high that developers are writing their own cheques. And often the EU …

How to communicate between your Chrome extension and your SPA web app

I needed a Chrome extension that could open my single-page application and send any text field to it, and after editing, send the changes back to the field. Sounds simple, but it led me down many dead ends and complex APIs. The first catch was that chrome.windows.create can take a callback which gets a window object, …

I get NoNodeAvailableException on long-running processes using SSL TransportClient with Found.no

This is because Found attaches multiple IPs to your 0298347602938ahdf.us-east-1.aws.found.io hostname. So if you use a TransportClient with SSL on port 9343 and add only the first IP you find with client.addTransportAddress(new InetSocketTransportAddress(host, port)), it’ll eventually stop working because it’s stuck with an old, invalid IP. The solution is to look up all the IPs on the hostname …
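
The post's own code is Java, and what it does with the addresses is cut off in this excerpt. As an illustration of the lookup step only, here is a rough Python sketch that resolves every IP behind a hostname instead of taking the first one. The function name resolve_all is hypothetical; in Java the equivalent lookup is InetAddress.getAllByName.

```python
import socket


def resolve_all(host, port):
    """Return every distinct IP address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    ips = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in ips:  # the same IP can appear more than once
            ips.append(ip)
    return ips
```

Calling this again periodically, rather than once at startup, is what keeps a long-running process from holding on to a stale address.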

When I index something in Elasticsearch, why doesn’t it show in my search straight after?

I got burnt by this little architectural nuance in Elasticsearch recently. While batch processing items in a content store, updating their status, then searching for more items, I kept getting stale data and didn’t understand why. It turned out that Elasticsearch is _near_ realtime, with a default 1s refresh interval. So if you index and query within …

Why does Apache Nutch sometimes get stuck using a single thread and crawling slowly?

Nutch generates a list of URLs to fetch from the crawldb. In ./bin/crawl, the size of the fetch list defaults to sizeFetchList=50000. With the default setting generate.max.count=-1, which is unrestricted, you can end up with 50000 URLs from the same domain in your fetch list. Then the setting fetcher.queue.mode=byHost only creates …
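
To make the effect concrete, here is a toy Python model. It is not Nutch code, and it assumes that byHost mode means one fetch queue per hostname and that a non-negative generate.max.count caps the URLs taken per host. The helpers queues_by_host and cap_per_host are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlsplit


def queues_by_host(urls):
    """Count URLs per hostname; each key stands for one fetch queue."""
    return Counter(urlsplit(u).hostname for u in urls)


def cap_per_host(urls, max_count):
    """Keep at most max_count URLs per host; -1 means unrestricted."""
    if max_count < 0:
        return list(urls)
    kept, seen = [], Counter()
    for u in urls:
        host = urlsplit(u).hostname
        if seen[host] < max_count:
            seen[host] += 1
            kept.append(u)
    return kept
```

With an unrestricted count, a fetch list drawn entirely from one site gives len(queues_by_host(urls)) == 1, so there is only one queue for the fetcher threads to share.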

Centralising Clojure/Java logging with Logback, LogStash, ElasticSearch, and Kibana

Checking logs when you have more than one server is painful. Use Logback/Logstash-forwarder to send JSON-formatted logs to a central server running Logstash/ElasticSearch/Kibana, where you can then slice and dice logs to your heart’s content with the power of ElasticSearch and Kibana. Confs and docs available here: https://github.com/vaughnd/centralised-logging

Keybinding for Emacs Helm to recursively grep certain file extensions in your src directories

Helm for Emacs is a fantastic Quicksilver-like extension, but it gets quite wordy sometimes. Instead of C-u M-x helm-do-grep *nav to dir* *enter extensions* *enter query* to recursively grep, I defined the following in my init.el. Now hitting F1 will grep actual source across all my projects. (defun project-search () (interactive) (helm-do-grep-1 '("/home/vaughn/src") '(4) nil …
