Awesome Lucene
Recently, we had an issue in one of our production environments that made the UI unusable for one of our main use cases: the auto-complete was stalling for a long time and not returning any data. We couldn't reproduce the exact issue in our test environment because the amount of data we store and process in production is 100x larger. To reproduce the bug effectively, I had to simulate the situation with a test. In this multi-part series, we shall dive into the awesome world of Lucene!
The autocomplete code in the API uses a PATRICIA (Practical Algorithm to Retrieve Information Coded in Alphanumeric) trie implementation to store the data and return results. A PATRICIA trie, also known as a radix trie, is a compressed trie in which any node that is an only child is merged with its parent. The first step was to recreate a version of the autocomplete implementation to isolate the problem.
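Before diving into the cache loader, here is a rough, stdlib-only sketch of what a prefix lookup gives us. The `TreeMap` here is just an illustrative stand-in (the production code uses Apache Commons Collections' `PatriciaTrie`): every key sharing a prefix occupies a contiguous range of a sorted map, which is the same property a compressed trie exploits.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixLookupSketch {
    // Emulates a trie's prefixMap with a TreeMap: every key that starts
    // with `prefix` sorts inside the range [prefix, prefix + MAX_VALUE).
    static NavigableMap<String, String> prefixMap(TreeMap<String, String> map, String prefix) {
        return map.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        TreeMap<String, String> map = new TreeMap<>();
        for (String ip : new String[]{"10.0.0.1", "10.0.0.2", "192.168.1.1"}) {
            map.put(ip, ip);
        }
        // Keys starting with "10." — the 192.168.* entry is excluded
        System.out.println(prefixMap(map, "10.").keySet()); // prints [10.0.0.1, 10.0.0.2]
    }
}
```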
```java
private class AutoCompleteCacheLoader extends CacheLoader<String, PatriciaTrie<String>> {

    @Override
    public PatriciaTrie<String> load(String field) {
        return loadAll(field);
    }

    PatriciaTrie<String> loadAll(String field) {
        return loadTrie(field);
    }

    PatriciaTrie<String> loadTrie(String field) {
        PatriciaTrie<String> trie = new PatriciaTrie<String>();
        // load 10M items into the trie
        for (int i = 0; i <= 10000000; i++) {
            switch (field.toLowerCase()) {
                case "ipaddress":
                    String ip = randomIpAddress();
                    trie.put(ip, ip);
                    break;
                case "hostname":
                    String hostName = RandomStringUtils.randomAlphanumeric(10);
                    trie.put(hostName, hostName);
                    break;
                case "username":
                    String userName = RandomStringUtils.randomAlphanumeric(10);
                    trie.put(userName, userName);
                    break;
                case "macaddress":
                    String mac = randomMACAddress();
                    trie.put(mac, mac);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown field name: " + field);
            }
        }
        return trie;
    }

    private String randomIpAddress() {
        Random rand = new Random();
        return String.format("%d.%d.%d.%d", rand.nextInt(256), rand.nextInt(256), rand.nextInt(256), rand.nextInt(256));
    }

    private String randomMACAddress() {
        Random rand = new Random();
        byte[] macAddr = new byte[6];
        rand.nextBytes(macAddr);
        macAddr[0] = (byte) (macAddr[0] & (byte) 254); // clear the multicast bit so the address is unicast
        StringBuilder sb = new StringBuilder(18);
        for (byte b : macAddr) {
            if (sb.length() > 0)
                sb.append(":");
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    @Override
    public ListenableFuture<PatriciaTrie<String>> reload(final String key, PatriciaTrie<String> prevGraph) {
        // reload asynchronously so readers keep getting the old trie in the meantime
        ListenableFutureTask<PatriciaTrie<String>> task = ListenableFutureTask.create(new Callable<PatriciaTrie<String>>() {
            @Override
            public PatriciaTrie<String> call() throws Exception {
                return loadTrie(key);
            }
        });
        Executors.newSingleThreadExecutor().execute(task);
        return task;
    }
}
```

Now, I need to create a test that exercises this code and instruments it to measure the retrieval time. For example:
```java
private List<String> getAllCopy(String incomingString, String fieldName, LoadingCache<String, PatriciaTrie<String>> cache)
        throws Exception {
    List<String> list = Lists.newArrayList();
    fieldName = fieldName.toLowerCase();
    PatriciaTrie<String> trie = cache.get(fieldName);
    Map<String, String> map = trie.prefixMap(incomingString.toLowerCase());
    list.addAll(map.values());
    return list;
}

@Test
public void testTrieAllCopy() throws Exception {
    // use the lowercased key, matching what getAllCopy looks up
    PatriciaTrie<String> trie = cache.get("ipaddress");
    assertNotNull(trie);
    assertNotEquals(0, trie.size());
    long start = System.nanoTime();
    Collection<String> all = getAllCopy("1", "ipAddress", cache);
    long stop = System.nanoTime();
    System.out.println("Time taken for fetching " + all.size() + " records: " + (stop - start) / 1000 + " micro secs (No Priming)");
}
```

Running the test reproduced the problem we were experiencing in production.
The code was not priming the trie before measuring the speed, and measuring performance that way is usually considered bad practice. So I added code to measure the average time taken for fetching the data after priming.
```java
@Test
public void testTrieAllCopy() throws Exception {
    // use the lowercased key, matching what getAllCopy looks up
    PatriciaTrie<String> trie = cache.get("ipaddress");
    assertNotNull(trie);
    assertNotEquals(0, trie.size());
    // first fetch primes the trie; timed separately
    long start = System.nanoTime();
    Collection<String> all = getAllCopy("1", "ipAddress", cache);
    long stop = System.nanoTime();
    System.out.println("Time taken for fetching " + all.size() + " records: " + (stop - start) / 1000 + " micro secs (No Priming)");
    start = System.nanoTime();
    for (int i = 0; i < 100; i++) {
        all = getAllCopy("1", "ipAddress", cache);
    }
    stop = System.nanoTime();
    assertNotEquals(0, all.size());
    System.out.println("Average time taken for fetching " + all.size() + " records: " + (stop - start) / (1000 * 100) + " micro secs (Priming)");
}
```

Running the test after priming and averaging the retrieval time yielded nearly identical results, as shown below. So something in the code itself must be contributing to the issue.
| Iteration | Copy (micro secs) | Copy (Priming) (micro secs) |
|---|---|---|
| 1 | 1602699 | 1677258 |
| 2 | 1609670 | 1663935 |
| 3 | 1519785 | 1638693 |
| 4 | 1587678 | 1713244 |
| 5 | 1521048 | 1705601 |
| 6 | 1552100 | 1653090 |
| 7 | 1571336 | 1661922 |
| 8 | 1674482 | 1571964 |
| 9 | 2360497 | 1759636 |
| 10 | 1700609 | 1830021 |
Inspecting the code more closely, I noticed a copy operation that might be responsible for some of the time taken. What if the copy operation were eliminated? What would the performance look like?
```java
private Collection<String> getAllNoCopy(String incomingString, String fieldName, LoadingCache<String, PatriciaTrie<String>> cache)
        throws Exception {
    fieldName = fieldName.toLowerCase();
    PatriciaTrie<String> trie = cache.get(fieldName);
    Map<String, String> map = trie.prefixMap(incomingString.toLowerCase());
    return map.values();
}

@Test
public void testTrieNoCopy() throws Exception {
    // use the lowercased key, matching what getAllNoCopy looks up
    PatriciaTrie<String> trie = cache.get("ipaddress");
    assertNotNull(trie);
    assertNotEquals(0, trie.size());
    // first fetch primes the trie; timed separately
    long start = System.nanoTime();
    Collection<String> all = getAllNoCopy("1", "ipAddress", cache);
    long stop = System.nanoTime();
    System.out.println("Time taken for fetching " + all.size() + " records: " + (stop - start) / 1000 + " micro secs (No Priming)");
    start = System.nanoTime();
    for (int i = 0; i < 100; i++) {
        all = getAllNoCopy("1", "ipAddress", cache);
    }
    stop = System.nanoTime();
    assertNotEquals(0, all.size());
    System.out.println("Average time taken for fetching " + all.size() + " records: " + (stop - start) / (1000 * 100) + " micro secs (Priming)");
}
```

Running the above test produced the following results.
| Iteration | Copy (micro secs) | Copy (Priming) (micro secs) | No Copy (micro secs) | No Copy (Priming) (micro secs) |
|---|---|---|---|---|
| 1 | 1602699 | 1677258 | 4622 | 13 |
| 2 | 1609670 | 1663935 | 3633 | 13 |
| 3 | 1519785 | 1638693 | 2611 | 13 |
| 4 | 1587678 | 1713244 | 5157 | 12 |
| 5 | 1521048 | 1705601 | 3287 | 12 |
| 6 | 1552100 | 1653090 | 1738 | 12 |
| 7 | 1571336 | 1661922 | 1618 | 13 |
| 8 | 1674482 | 1571964 | 3767 | 15 |
| 9 | 2360497 | 1759636 | 10579 | 14 |
| 10 | 1700609 | 1830021 | 16134 | 12 |
| Average | 1669990.4 | 1687536.4 | 5314.6 | 12.9 |
OK, the average time taken for retrieval has dropped from roughly 1.7 seconds to about 5 milliseconds! A good start. But the nature of auto-complete is such that even the first invocation of the method must return within 500 ms.
Furthermore, it turns out that copying the data into a different structure before returning accounts for most of the cost of the first invocation. For every subsequent invocation the difference is dramatic: roughly 1.7 seconds with the copy versus about 13 microseconds without. It appears that `PatriciaTrie` is handing back its internal data rather than rebuilding it on every call.
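This is because `prefixMap` in Apache Commons Collections returns a live view backed by the trie rather than a fresh collection, so almost no work happens until the caller iterates. A stdlib sketch of the view-versus-copy distinction, using `TreeMap.subMap` as an illustrative stand-in for the trie:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class ViewVsCopy {
    // Builds a sorted map of n keys: "key0" .. "key(n-1)"
    static TreeMap<String, String> sampleData(int n) {
        TreeMap<String, String> data = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            data.put("key" + i, "key" + i);
        }
        return data;
    }

    public static void main(String[] args) {
        TreeMap<String, String> data = sampleData(100_000);

        // View: cheap to create, backed by the original map, no copying
        SortedMap<String, String> view = data.subMap("key1", "key2");

        // Copy: walks and duplicates every matching entry up front
        List<String> copy = new ArrayList<>(view.values());

        // A later insertion shows up in the view but not in the copy
        data.put("key1zzz", "key1zzz");
        System.out.println("view: " + view.size() + ", copy: " + copy.size());
    }
}
```

The same trade-off applies to the trie: `getAllNoCopy` returns the cheap view, while `getAllCopy` pays to materialize every matching value on every call.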
The other issue with this particular PATRICIA trie implementation is that there is no way to fetch only the top 10 results. So we had to find another way. Enter Lucene!
Apache Lucene™ is a high-performance, full-featured text search engine library written entirely in Java. Lucene is the storage engine underneath full-text search engines such as Elasticsearch and Solr, and it provides easy-to-use APIs for storing and retrieving data in high-performance search applications.
To build a Lucene index in memory, all we have to do is:
- Create a `Directory`
- Open an `IndexWriter`
- Load the strings into the index

It's that simple!
```java
private long initLuceneIndex() throws Exception {
    Analyzer analyzer = new StandardAnalyzer();
    // store the index in memory
    Directory directory = new RAMDirectory();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter iwriter = new IndexWriter(directory, config);
    long numDocs = loadLuceneIndex(iwriter);
    System.out.println("Number of docs inserted: " + numDocs);
    iwriter.close();
    return numDocs;
}
```

To search the Lucene index, all we have to do is:
- Open the directory for reading using `DirectoryReader.open`
- Create an `IndexSearcher`
- Create a query
- Search the index

Can it get any simpler? I don't think so.
```java
private void searchIndex(Directory directory) throws Exception {
    DirectoryReader ireader = DirectoryReader.open(directory);
    System.out.println("Number of docs read: " + ireader.numDocs());
    IndexSearcher isearcher = new IndexSearcher(ireader);
    // create the query for searching the index;
    // here we search for all ipAddress values containing "1"
    RegexpQuery regexpQuery = new RegexpQuery(new Term("ipAddress", ".*" + "1" + ".*"));
    long start = System.nanoTime();
    ScoreDoc[] hits = isearcher.search(regexpQuery, 10).scoreDocs;
    long stop = System.nanoTime();
    for (ScoreDoc hit : hits) {
        Document hitDoc = isearcher.doc(hit.doc);
        System.out.println("Doc: " + hitDoc.toString());
        assertTrue(hitDoc.get("ipAddress").contains("1"));
    }
    System.out.println("Search took " + (stop - start) / 1000 + " micro secs");
    ireader.close();
}
```

(Note that a `RegexpQuery` with a leading wildcard has to examine many terms; for true prefix matches, Lucene's `PrefixQuery` is much cheaper.) Here are the supporting methods for loading the Lucene index.
```java
private long loadLuceneIndex(IndexWriter writer) throws Exception {
    long numDocs = 0;
    for (int i = 0; i <= 10000000; i++) {
        String ip = randomIpAddress();
        String hostName = RandomStringUtils.randomAlphanumeric(10);
        String macAddress = randomMACAddress();
        String userName = RandomStringUtils.randomAlphanumeric(10);
        Document doc = new Document();
        doc.add(new Field("ipAddress", ip, StringField.TYPE_STORED));
        doc.add(new Field("hostName", hostName, StringField.TYPE_STORED));
        doc.add(new Field("macAddress", macAddress, StringField.TYPE_STORED));
        doc.add(new Field("userName", userName, StringField.TYPE_STORED));
        writer.addDocument(doc);
        numDocs++;
    }
    return numDocs;
}
```

`randomIpAddress` and `randomMACAddress` are the same helpers shown earlier in the cache loader.
So far so good, but are the performance numbers what we're after? Let's run a performance test.
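A minimal timing harness for collecting those numbers might look like the sketch below (`averageMicros` is a hypothetical helper, not from the codebase above; for serious measurement a framework like JMH is the better choice, since naive `nanoTime` loops are skewed by JIT warm-up and GC):

```java
import java.util.function.Supplier;

public class MicroTimer {
    // Runs the action `warmup` times untimed, then averages `iters` timed runs.
    static long averageMicros(Supplier<?> action, int warmup, int iters) {
        for (int i = 0; i < warmup; i++) {
            action.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            action.get();
        }
        return (System.nanoTime() - start) / (1_000L * iters);
    }

    public static void main(String[] args) {
        // time a cheap stand-in action; in the real test this would be the search call
        long avg = averageMicros(() -> String.format("%d.%d.%d.%d", 10, 0, 0, 1), 1_000, 10_000);
        System.out.println("average: " + avg + " micro secs");
    }
}
```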
TODO add the performance numbers
Can we do better? Yes, of course. In the next part of this blog series, we'll talk about ways to extract even better performance.