Ram Viswanadha edited this page Nov 5, 2017 · 1 revision

Introduction

Recently, we had an issue in one of our production environments that made the UI unusable for one of our main use cases: the auto-complete was stalling for a long time and not returning any data. We couldn't reproduce the exact issue in our test environment because the amount of data we store and process in production is about 100x larger. To reproduce the bug effectively, I had to simulate the situation with a test. In this multi-part series, we shall dive into the awesome world of Lucene!

Simulation

The autocomplete code in the API uses a PATRICIA (Practical Algorithm To Retrieve Information Coded In Alphanumeric) trie implementation to store the data and return results. A PATRICIA trie, also known as a radix trie, is a compressed trie in which any node that is an only child is merged with its parent. The first step was to recreate a version of the autocomplete implementation to isolate the problem.
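The post uses the Apache Commons Collections `PatriciaTrie`; as a rough, dependency-free sketch of what its `prefixMap` operation gives you, a `TreeMap` can emulate prefix retrieval by taking a sub-range of its sorted keys. This is an illustration only (the `PrefixDemo` class and its helper are hypothetical names, not the Commons API):

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class PrefixDemo {
    // Return the sub-view of entries whose keys start with the given prefix.
    // TreeMap keeps keys sorted, so all matches form the contiguous range
    // [prefix, prefix + '\uffff') -- a stand-in for PatriciaTrie.prefixMap().
    static SortedMap<String, String> prefixMap(TreeMap<String, String> map, String prefix) {
        return map.subMap(prefix, prefix + '\uffff');
    }

    public static void main(String[] args) {
        TreeMap<String, String> map = new TreeMap<>();
        map.put("10.0.0.1", "10.0.0.1");
        map.put("10.0.0.2", "10.0.0.2");
        map.put("192.168.1.1", "192.168.1.1");
        // matches the two addresses starting with "10."
        System.out.println(prefixMap(map, "10.").size()); // 2
    }
}
```

A real PATRICIA trie answers the same query from its compressed node structure rather than a key range, but the calling pattern is the same.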

private class AutoCompleteCacheLoader extends CacheLoader<String, PatriciaTrie<String>> {

        @Override
        public PatriciaTrie<String> load(String field){
            return loadAll(field);
        }
        PatriciaTrie<String> loadAll(String field){
            return loadTrie(field);
        }
        PatriciaTrie<String> loadTrie(String field){
            PatriciaTrie<String> trie = new PatriciaTrie<String>();
            // load 10M items into the Trie 
            for(int i=0; i<10000000; i++){
                switch(field.toLowerCase()){
                    case "ipaddress":
                        String ip = randomIpAddress();
                        trie.put(ip, ip);
                        break;
                    case "hostname":
                        String hostName = RandomStringUtils.randomAlphanumeric(10);
                        trie.put(hostName, hostName);
                        break;
                    case "username":
                        String userName = RandomStringUtils.randomAlphanumeric(10);
                        trie.put(userName, userName);
                        break;
                    case "macaddress":
                        String mac = randomMACAddress();
                        trie.put(mac, mac);
                        break;
                    default:
                        throw new IllegalArgumentException("Unknown field name: "+field);
                }
            }
            return trie;
        }
        private String randomIpAddress(){
            Random rand = new Random();
            return String.format("%d.%d.%d.%d", rand.nextInt(256), rand.nextInt(256), rand.nextInt(256), rand.nextInt(256));
        }
        private String randomMACAddress(){
            Random rand = new Random();
            byte[] macAddr = new byte[6];
            rand.nextBytes(macAddr);

            macAddr[0] = (byte)((macAddr[0] & 0xFE) | 0x02);  // clear the multicast bit (unicast) and set the locally administered bit

            StringBuilder sb = new StringBuilder(18);
            for(byte b : macAddr){
                if(sb.length() > 0)
                    sb.append(":");

                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        }

        @Override
        public ListenableFuture<PatriciaTrie<String>> reload(final String key, PatriciaTrie<String> prevGraph) {

            // asynchronous
            ListenableFutureTask<PatriciaTrie<String>> task = ListenableFutureTask.create(new Callable<PatriciaTrie<String>>() {
                @Override
                public PatriciaTrie<String> call() throws Exception {
                    return loadTrie(key);
                }
            });
            ExecutorService executor = Executors.newSingleThreadExecutor();
            executor.execute(task);
            executor.shutdown(); // allow the worker thread to exit once the reload completes
            return task;

        }
    }

Now, I need to create a test that exercises the code and instruments it to measure the retrieval time. For example:

    private List<String> getAllCopy(String incomingString, String fieldName, LoadingCache<String, PatriciaTrie<String>> cache)
            throws Exception{

        List<String> list = Lists.newArrayList();
        fieldName = fieldName.toLowerCase();
        PatriciaTrie<String> trie = cache.get(fieldName);

        Map<String, String> map = trie.prefixMap(incomingString.toLowerCase());
        list.addAll(map.values());
        return list;

    }
    @Test
    public void testTrieAllCopy() throws Exception{
        PatriciaTrie<String> trie = cache.get("ipAddress");
        assertNotNull(trie);
        assertNotEquals(0, trie.size());
        //prime the cache
        long start = System.nanoTime();
        Collection<String> all = getAllCopy("1", "ipAddress", cache);
        long stop = System.nanoTime();
        System.out.println("Time taken for fetching "+Integer.toString(all.size())+" records: "+Long.toString((stop-start)/(1000))+" micro secs (No Priming)");
    } 

Running the test reproduced the problem we were experiencing in production.

The code was not priming the trie before measuring the speed, and measuring performance this way is usually considered bad practice. So I added code to measure the average time taken to fetch the data after priming.

    @Test
    public void testTrieAllCopy() throws Exception{
        PatriciaTrie<String> trie = cache.get("ipAddress");
        assertNotNull(trie);
        assertNotEquals(0, trie.size());
        //prime the cache
        long start = System.nanoTime();
        Collection<String> all = getAllCopy("1", "ipAddress", cache);
        long stop = System.nanoTime();
        System.out.println("Time taken for fetching "+Integer.toString(all.size())+" records: "+Long.toString((stop-start)/(1000))+" micro secs (No Priming)");

        start = System.nanoTime();
        for(int i=0; i<100; i++) {
            all = getAllCopy("1", "ipAddress", cache);
        }
        stop = System.nanoTime();
        assertNotEquals(0,all.size());
        System.out.println("Average Time taken for fetching "+Integer.toString(all.size())+" records: "+Long.toString((stop-start)/(1000*100))+" micro secs (Priming)");

    }

Running the test after priming and averaging the retrieval time yielded nearly identical results, as shown below. So something in the code itself must be contributing to the issue.

| Iteration | Copy (micro secs) | Copy (priming) (micro secs) |
| --- | --- | --- |
| 1 | 1602699 | 1677258 |
| 2 | 1609670 | 1663935 |
| 3 | 1519785 | 1638693 |
| 4 | 1587678 | 1713244 |
| 5 | 1521048 | 1705601 |
| 6 | 1552100 | 1653090 |
| 7 | 1571336 | 1661922 |
| 8 | 1674482 | 1571964 |
| 9 | 2360497 | 1759636 |
| 10 | 1700609 | 1830021 |

Inspecting the code more closely, I noticed a copy operation that might be responsible for some of the time taken. What if the copy operation were eliminated? What would the performance look like?

    private Collection<String> getAllNoCopy(String incomingString, String fieldName, LoadingCache<String, PatriciaTrie<String>> cache)
            throws Exception{

        fieldName = fieldName.toLowerCase();
        PatriciaTrie<String> trie = cache.get(fieldName);

        Map<String, String> map = trie.prefixMap(incomingString.toLowerCase());

        return map.values();

    }
    @Test
    public void testTrieNoCopy() throws Exception{
        PatriciaTrie<String> trie = cache.get("ipAddress");
        assertNotNull(trie);
        assertNotEquals(0, trie.size());
        //prime the cache
        long start = System.nanoTime();
        Collection<String> all = getAllNoCopy("1", "ipAddress", cache);
        long stop = System.nanoTime();
        System.out.println("Time taken for fetching "+Integer.toString(all.size())+" records: "+Long.toString((stop-start)/(1000))+" micro secs (No Priming)");

        start = System.nanoTime();
        for(int i=0; i<100; i++) {
            all = getAllNoCopy("1", "ipAddress", cache);
        }
        stop = System.nanoTime();
        assertNotEquals(0,all.size());
        System.out.println("Average Time taken for fetching "+Integer.toString(all.size())+" records: "+Long.toString((stop-start)/(1000*100))+" micro secs (Priming)");

    }

Running the above test produced the following results.

| Iteration | Copy (micro secs) | Copy (priming) (micro secs) | No copy (micro secs) | No copy (priming) (micro secs) |
| --- | --- | --- | --- | --- |
| 1 | 1602699 | 1677258 | 4622 | 13 |
| 2 | 1609670 | 1663935 | 3633 | 13 |
| 3 | 1519785 | 1638693 | 2611 | 13 |
| 4 | 1587678 | 1713244 | 5157 | 12 |
| 5 | 1521048 | 1705601 | 3287 | 12 |
| 6 | 1552100 | 1653090 | 1738 | 12 |
| 7 | 1571336 | 1661922 | 1618 | 13 |
| 8 | 1674482 | 1571964 | 3767 | 15 |
| 9 | 2360497 | 1759636 | 10579 | 14 |
| 10 | 1700609 | 1830021 | 16134 | 12 |
| Average | 1669990.4 | 1687536.4 | 5314.6 | 12.9 |

OK, the average time for the first retrieval has dropped from roughly 1.7 seconds to about 5 milliseconds! A good start. The nature of auto-complete is such that even the first invocation of the method must return within 500 ms, and the no-copy version now meets that budget.

Furthermore, it turns out that copying the data into a different structure before returning dominates the cost of the first invocation. For every subsequent invocation the difference is dramatic: roughly 1.7 seconds versus 13 micro secs. It appears that prefixMap returns a lazy view over the trie, so the no-copy version does almost no work per call.
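The 12 micro sec numbers are consistent with prefixMap handing back a live view rather than a materialized result: creating the view is cheap, and only iterating or copying it pays the traversal cost. The same view-versus-copy distinction can be seen with any JDK sorted-map view (a stdlib illustration, not the Commons Collections code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class ViewVsCopy {
    public static void main(String[] args) {
        TreeMap<String, String> map = new TreeMap<>();
        map.put("10.0.0.1", "a");

        // A sub-range *view*: cheap to create, backed by the original map.
        SortedMap<String, String> view = map.headMap("2");
        // A *copy*: walks every matching entry up front.
        List<String> copy = new ArrayList<>(view.values());

        map.put("10.0.0.2", "b"); // mutate after both were created

        System.out.println(view.size()); // 2 -- the view sees the new entry
        System.out.println(copy.size()); // 1 -- the copy is frozen
    }
}
```

This is also why returning the raw view from `getAllNoCopy` is only safe if callers don't mutate the trie while iterating the result.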

The other issue with this specific PATRICIA trie implementation is that there is no way to fetch only the top 10 results. So we had to find another way. Enter Lucene!
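For completeness: with a map-based trie you can cap the result count by walking the prefix view and stopping after ten entries, but every query still iterates entry by entry in key order, with no notion of relevance ranking. A stdlib sketch of such a cap (the `TopN`/`topN` names are hypothetical, and `TreeMap` stands in for the trie):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class TopN {
    // Take at most 'limit' values from a prefix view, in key order.
    // There is no scoring here -- just lexicographic truncation.
    static List<String> topN(SortedMap<String, String> prefixView, int limit) {
        List<String> out = new ArrayList<>(limit);
        for (String v : prefixView.values()) {
            if (out.size() == limit) break;
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> map = new TreeMap<>();
        for (int i = 0; i < 100; i++) {
            String ip = "10.0.0." + i;
            map.put(ip, ip);
        }
        // 100 keys match the "10." prefix, but only 10 are returned
        System.out.println(topN(map.subMap("10.", "10." + '\uffff'), 10).size()); // 10
    }
}
```

Truncating in key order is rarely what a user wants from auto-complete, which is one reason a search library with scoring is a better fit.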

Apache Lucene

Apache Lucene™ is a high-performance, full-featured text search engine library written entirely in Java. Lucene is the storage engine underneath full-text search engines such as Elasticsearch and Solr.

Lucene provides easy-to-use APIs to store and retrieve data for high-performance search applications.

Building the Index

To build a Lucene index in memory all we have to do is:

  1. Create a Directory
  2. Open the IndexWriter
  3. Load the strings into the index.

It's that simple!

private long initLuceneIndex() throws Exception {
        Analyzer analyzer = new StandardAnalyzer();
        // Store the index in memory:
        Directory directory = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter iwriter = new IndexWriter(directory, config);
        long numDocs = loadLuceneIndex(iwriter);
        System.out.println("Number of docs inserted: " + numDocs);
        iwriter.close();
        return numDocs; // the method is declared long, so return the count
}

To search the Lucene index all we have to do is:

  1. Open the directory for reading using DirectoryReader.open
  2. Create an IndexSearcher
  3. Create a query
  4. Search the index

Can it get any simpler? I don't think so.

private long searchIndex() throws Exception {
        DirectoryReader ireader = DirectoryReader.open(directory);
        System.out.println("Number of docs read: " + ireader.numDocs());
        IndexSearcher isearcher = new IndexSearcher(ireader);
        // create the query for searching the index;
        // here we search for all ipAddress values containing "1"
        RegexpQuery regexpQuery = new RegexpQuery(new Term("ipAddress", ".*" + "1" + ".*"));
        long start = System.nanoTime();
        ScoreDoc[] hits = isearcher.search(regexpQuery, 10).scoreDocs;
        for (ScoreDoc hit : hits) {
            Document hitDoc = isearcher.doc(hit.doc);
            System.out.println("Doc: " + hitDoc.toString());
            assertTrue(hitDoc.get("ipAddress").contains("1"));
        }
        return System.nanoTime() - start; // elapsed search time in nanos
}

Here are supporting methods for loading the Lucene index.

private long loadLuceneIndex(IndexWriter writer) throws Exception {
        long numDocs = 0;
        for(int i=0; i<10000000; i++){
            String ip = randomIpAddress();
            String hostName = RandomStringUtils.randomAlphanumeric(10);
            String macAddress = randomMACAddress();
            String userName = RandomStringUtils.randomAlphanumeric(10);
            Document doc = new Document();
            doc.add(new Field("ipAddress", ip, StringField.TYPE_STORED));
            doc.add(new Field("hostName", hostName, StringField.TYPE_STORED));
            doc.add(new Field("macAddress", macAddress, StringField.TYPE_STORED));
            doc.add(new Field("userName", userName, StringField.TYPE_STORED));
            writer.addDocument(doc);
            numDocs++;
        }
        return numDocs;
    }
    private String randomIpAddress(){
        Random rand = new Random();
        return String.format("%d.%d.%d.%d", rand.nextInt(256), rand.nextInt(256), rand.nextInt(256), rand.nextInt(256));
    }
    private String randomMACAddress(){
        Random rand = new Random();
        byte[] macAddr = new byte[6];
        rand.nextBytes(macAddr);

        macAddr[0] = (byte)((macAddr[0] & 0xFE) | 0x02);  // clear the multicast bit (unicast) and set the locally administered bit

        StringBuilder sb = new StringBuilder(18);
        for(byte b : macAddr){
            if(sb.length() > 0)
                sb.append(":");

            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

So far, so good, but are the performance numbers what we are after? Let's run a performance test.

TODO add the performance numbers

Can we do better? Yes, of course. In the next part of this series, we shall talk about ways to extract even better performance.
