For some code you may want to limit its execution time, to guard against infinite loops that can't be detected from within the code itself. I ran into this issue with HTMLCleaner on certain HTML: because of its architecture, a simple counter could not keep track of possible loops.
The approach below runs the work on a separate thread and waits for the result with a timeout. The code being limited needs to handle thread interrupts internally, by checking Thread.interrupted() or by calling blocking methods that throw InterruptedException, so this won't always work on arbitrary code.
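To illustrate the caveat about interrupts, here is a minimal sketch of a task that cooperates with cancellation. The names InterruptDemo, CountingTask and runAndCancel are made up for this example and are not part of HTMLCleaner; the loop simply stands in for any long-running work.

```java
import java.util.concurrent.*;

class InterruptDemo {
    // Hypothetical task: loops forever unless it notices an interrupt.
    static class CountingTask implements Callable<Long> {
        @Override
        public Long call() throws InterruptedException {
            long n = 0;
            while (true) {
                // Thread.interrupted() returns true once after future.cancel(true)
                if (Thread.interrupted()) {
                    throw new InterruptedException("stopped after " + n + " iterations");
                }
                n++;
            }
        }
    }

    // Returns true if the worker thread actually stopped after the timeout.
    static boolean runAndCancel(long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Long> future = executor.submit(new CountingTask());
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException ex) {
            future.cancel(true); // sets the worker's interrupt flag
        } catch (InterruptedException | ExecutionException ex) {
            future.cancel(true);
        } finally {
            executor.shutdownNow();
        }
        try {
            // Only succeeds because the task polls the interrupt flag
            return executor.awaitTermination(2, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If the interrupt check were removed from the loop, future.get would still time out, but the worker thread would keep spinning in the background, which is exactly the limitation described above.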
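import java.util.concurrent.*;
import org.htmlcleaner.TagNode;
import org.w3c.dom.Document;

// Inner class of the cleaning class: cleaner and domSerializer are assumed
// to be fields of the enclosing class
class HTMLParseTask implements Callable<Document> {
    private final String html;

    HTMLParseTask(String html) {
        this.html = html;
    }

    @Override
    public Document call() throws Exception {
        TagNode tagNode = cleaner.clean(html);
        return domSerializer.createDOM(tagNode);
    }
}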
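public Document clean(String html) {
    if (html == null) return null;
    // limit the html cleaning to 5s, to avoid any bad html causing infinite loops
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<Document> future = executor.submit(new HTMLParseTask(html));
    Document result = null;
    try {
        result = future.get(5, TimeUnit.SECONDS);
    } catch (TimeoutException ex) {
        future.cancel(true); // cancel and send a thread interrupt
        log.error("Error parsing HTML. Timed out");
    } catch (ExecutionException ex) {
        // the parse itself threw; the original exception is the cause
        log.error("Error parsing HTML. " + ex.getCause());
    } catch (InterruptedException ex) {
        future.cancel(true);
        Thread.currentThread().interrupt(); // restore the caller's interrupt status
    } finally {
        executor.shutdownNow();
    }
    return result; // null if parsing timed out or failed
}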
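The same pattern can be pulled out into a reusable helper for any Callable. This is only a sketch: TimeLimiter and callWithTimeout are names invented here, and returning an empty Optional on timeout or failure is one possible design choice rather than the only one.

```java
import java.util.Optional;
import java.util.concurrent.*;

final class TimeLimiter {
    private TimeLimiter() {}

    // Runs the task on a throwaway thread. Returns an empty Optional if the
    // task times out, throws, or the calling thread is interrupted.
    static <T> Optional<T> callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeout, unit));
        } catch (TimeoutException ex) {
            future.cancel(true); // ask the task to stop via an interrupt
            return Optional.empty();
        } catch (ExecutionException ex) {
            return Optional.empty(); // the task itself failed
        } catch (InterruptedException ex) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // restore the caller's interrupt status
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With such a helper, the body of clean(html) could shrink to something like TimeLimiter.callWithTimeout(new HTMLParseTask(html), 5, TimeUnit.SECONDS).orElse(null). Creating an executor per call keeps the example simple; a shared executor would avoid the per-call thread creation, at the cost of one stuck task being able to block later ones.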