MaiDeveloper

Autocomplete / Suggestion / Typeahead Search using Trie Data Structure and Algorithm

What is it?

Autocomplete or suggestion or typeahead offers a number of possible values while typing. This usually applies to search bar and fill in form; such as city, state, country, etc.

Problem

In a survey form, participants need to type in their current state in a given text field. Given a list of 50 states. How can we increase the participants' typing speed and help them decrease the number of keystrokes to fill in the state? It must be case insenitive.

const states = [
  'Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut',
  'Delaware', 'Florida', 'Georgia', 'Hawaii', 'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas',
  'Kentucky', 'Louisiana', 'Maine', 'Maryland', 'Massachusetts', 'Michigan', 'Minnesota', 
  'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 
  'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon',
  'Pennsylvania', 'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Utah', 
  'Vermont', 'Virginia', 'Washington', 'West Virginia', 'Wisconsin', 'Wyoming'
];

If the participant didn't type in anything, it should return:

[]

If the participant type in 'q', it should return:

[]

If the participant type in 'n', it should return:

['Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 'New Mexico', 'New York', 'North Carolina', 'North Dakota']

If the participant type in 'ne', it should return:

['Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 'New Mexico', 'New York']

If the participant type in 'new', it should return:

['New Hampshire', 'New Jersey', 'New Mexico', 'New York']

If the participant type in 'new y', it should return:

['New York']

Brute Force Solution

We are going to declare an empty array that holds all the possible states. Next, We iterate each state (50 states) and check if the state starts with the user's input. If so, add it to the array that we declare earlier and later return it.

class Autocomplete {
  constructor(list) {
    this.list = list;
  }

  /**
   * Check if the string starts with another string
   * @param {string} word 
   * @param {string} input 
   * @return {boolean} whether or not the word starts with input
   * @time complexity O(1)
   * @space complexity O(1)
   */
  static isStartsWith(word, input) {
    return word.toUpperCase().startsWith(input.toUpperCase());
  }

  /**
   * Get an array of possible values
   * @param {string}    input 
   * @return {string[]} an array of possible values
   * @time complexity O(M) where M is the size of list
   * @space complexity O(M) where M is the size of list
   */
  getAutocompleteList(input) {
    if (!input) {
      return [];
    }

    const possibleValues = [];

    for (let i = 0; i < this.list.length; i++) {
      if (Autocomplete.isStartsWith(this.list[i], input)) {
        possibleValues.push(this.list[i]);
      }
    }

    return possibleValues;
  }
}

Test Brute Force Solution

Let's test it using jest.

describe('Brute Force', () => {

  let autocomplete;

  beforeAll(() => {
    autocomplete = new Autocomplete(states);
  });

  test('should return an array', () => {
    const result = autocomplete.getAutocompleteList('');
    expect(result).toBeInstanceOf(Array);
  });

  test('should return an empty array when there is no match', () => {
    const result = autocomplete.getAutocompleteList('q');
    expect(result).toHaveLength(0);
  });

  test('should return a list of possible values', () => {
    const result = autocomplete.getAutocompleteList('New');
    const expected = [ 'New Hampshire', 'New Jersey',  'New Mexico', 'New York' ];
    expect(result).toEqual(expected);
  });

  test('should return an array containing exact one element', () => {
    const result = autocomplete.getAutocompleteList('New York');
    const expected = [ 'New York' ];
    expect(result).toEqual(expected);
  });

});
Brute Force
  ✓ should return an array (1 ms)
  ✓ should return an empty array when there is no match
  ✓ should return a list of possible values (1 ms)
  ✓ should return an array containing exact one element

Trie Data Structure Solution

What is trie?

In computer science, a trie, also called digital tree or prefix tree, is a kind of search tree—an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings.
Wikipedia

First of all, we construct a trie data structure using object. Next, we search for user's input from the root of the trie using depth of search (dfs) or breath of search (bfs) algorithm. In this example, we will implement both algorithms. If the search is found, then return the rest of the word in an array.

Let's get started with creating a trie data structure. A trie would look like this. The graph below illustrates a trie for Apple, App, Car, Cat, and Case.

govern-island-map
class TrieNode {
  constructor() {
    // CONTAINS ZERO OR MORE CHILDREN
    this.children = {};
    // INDICATE WHETHER THIS IS THE END OF A WORD
    this.last = false;
  }
}

class Autocomplete {
  constructor(list) {
    // INITIALIZE THE ROOT
    this.root = this.buildTrie(list);
  }

  /**
   * BUILD A TRIE DATA STRUCTURE
   * @param {string[]} list An array of words
   * @return {TrieNode}
   * @time complexity: O(N M) where N is the size of the list, M is the length of the words in the list
   * @space complexity: O(N M) where N is the size of the list, M is the length of the words in the list
   */
  buildTrie(list) {
    // THE ROOT OF THE TREE
    const root = new TrieNode();
    
    // FOR EACH WORD
    for (const word of list) {
      // ADD TO THE ROOT
      this.insert(root, word);
    }

    return root;
  }

  /**
   * INSERT A CHILD NODE TO THE PARENT NODE
   * A CHILD NODE IS CONSTRUCTED BASE ON EACH CHARACTER IN THE GIVEN WORD
   * @param {TrieNode} node 
   * @param {string} word
   * @time complexity: O(N) where N is the size of word
   * @space complexity: O(N) where N is the size of word
   */
  insert(parent, word) {
    // ITERATE EACH CHARACTER IN THE WORD
    for (const char of word) {
      // ADD A CHILD IF IT DOES NOT EXIST
      if (parent.children[char] === undefined) {
        parent.children[char] = new TrieNode();
      }
      // SET THE CURRENT CHILD AS PARENT
      parent = parent.children[char];
    }

    // SET TO TRUE BECAUSE THIS IS THE END OF THE WORD
    parent.last = true;
  }
}

When we create an instance of Autocomplete with a list of states, it will create a trie.

const autocomplete = new Autocomplete(states);

Let's write a test for it.

describe('Autocomplete using Trie', () => {

  test('should build a trie', () => {
    const words = [
      'apple',
      'cat',
      'car',
      'app',
    ];
    const autocomplete = new Autocomplete(words);
    const result = JSON.stringify(autocomplete.root);
    const expected = JSON.stringify({
      children: {
        a: {
          children: {
            p: {
              children: {
                p: {
                  children: {
                    l: {
                      children: {
                        e: {
                          children: {},
                          last: true,
                        },
                      },
                      last: false,
                    },
                  },
                  last: true,
                },
              },
              last: false,
            },
          },
          last: false,
        },
        c: {
          children: {
            a: {
              children: {
                t: {
                  children: {},
                  last: true,
                },
                r: {
                  children: {},
                  last: true,
                },
              },
              last: false,
            },
          },
          last: false,
        }
      },
      last: false,
    });

    expect(result).toEqual(expected);
  });

});

Autocomplete using Trie
  ✓ should build a trie (2 ms)

Next, we are going to write a function that check whether it starts with a given word.

/**
 * @typeof {object} StartsWithResult
 * @param {TrieNode}  node
 * @param {string}    word
 */

/**
 * CHECK IF THERE IS A TRIE STARTS WITH THE INPUT
 * @param {string} input 
 * @return {StartsWithResult}
 * @time complexity O(N) where N is the size of input
 * @space complexity O(N) where N is the size of input
 */
isStartsWith(input) {
  let node = this.root,
      found = true,
      word = '';

  for (const char of input) {
    // CHECK IF THE CHARACTER EXIST
    if (node.children[char]) {
      node = node.children[char];
      word += char;
    } else if (node.children[char.toUpperCase()]) { // CASE INSENITIVE
      node = node.children[char.toUpperCase()];
      word += char.toUpperCase();
    } else {
      found = false;
      break;
    }
  }

  // RETURNS THE CURRENT NODE AND WORD
  return {
    node: (found ? node : null),
    word,
  };
}
 
 

Depth First Search

Now, we have our trie data structure. Let's write a function to get a list of suggestion. There are two ways of traversing a trie; they are depth first search and breath first search.

Let's implement depth first search.

/**
 * GET AN ARRAY OF SUGGESTIONS
 * @param {string} input
 * @return {string[]}
 * @time complexity: O(N) where N is the size of input
 * @space complexity: O(N) where N is the size of input
 */
getAutocompleteListDfs(input) {
  // CHECK IF IT STARTS WITH INPUT
  const { node, word } = this.isStartsWith(input);

  // RETURN EMPTY ARRAY IF NO MATCH
  if (!node) {
    return [];
  }

  const words = [];

  // ADD TO THE RESULT IF THE CURRENT NODE ALSO THE LAST OF A WORD
  if (node.last) {
    words.push(word);
  }

  return words.concat(this.getWordsDepthFirstSearch(word, node));
}

/**
 * GET ALL THE WORDS FROM THE GIVEN NODE
 * USING DEPTH FIRST SEARCH ALGORITHM
 * @param {string} prefix
 * @param {TrieNode} node 
 * @return {string[]} An array of words
 * @time complexity: O(N) where N is the size of node
 * @space complexity: O(N) where N is the size of node
 */
getWordsDepthFirstSearch(prefix, node) {
  if (!node) {
    return;
  }

  let words = [];

  for (const char in node.children) {
    // IF THIS IS THE END OF THE WORD
    if (node.children[char].last) {
      // CONCATENATE THE PREVIOUS CHARACTERS AND THE CURRENT CHARACTER
      // ADD IT TO THE WORDS
      words.push(prefix + char);
    }

    words = [
      ...words,
      ...this.getWordsDepthFirstSearch(prefix + char, node.children[char]) // RECURSIVELY GET THE REST OF THE CHARACTERS
    ];
  }

  return words;
} 

Breath First Search

/**
 * GET AN ARRAY OF SUGGESTIONS
 * @param {string} input
 * @return {string[]}
 * @time complexity: O(N) where N is the size of input
 * @space complexity: O(N) where N is the size of input
 */
getAutocompleteListBfs(input) {
  // CHECK IF IT STARTS WITH INPUT
  const { node, word } = this.isStartsWith(input);

  // RETURN EMPTY ARRAY IF NO MATCH
  if (!node) {
    return [];
  }

  const words = [];

  // ADD TO THE RESULT IF THE CURRENT NODE ALSO THE LAST OF A WORD
  if (node.last) {
    words.push(word);
  }

  return words.concat(this.getWordsBreathFirstSearch(word, node));
}


/**
 * GET ALL THE WORDS FROM THE GIVEN NODE
 * USING BREATH FIRST SEARCH ALGORITHM
 * @param {string} prefix
 * @param {TrieNode} node 
 * @return {string[]} An array of words
 * @time complexity: O(N) where N is the size of node
 * @space complexity: O(N) where N is the size of node
 */
getWordsBreathFirstSearch(prefix, node) {
  const words = [];
  const queue = [
    { node, prefix }
  ];

  while (queue.length) {
      const { node, prefix } = queue.shift();

      for (const key in node.children) {
        // IF THIS IS THE END OF THE WORD
        if (node.children[key].last) {
          // CONCATENATE THE PREVIOUS CHARACTERS AND THE CURRENT CHARACTER
        // ADD IT TO THE WORDS
        words.push(prefix + key);
        }

        // ADD TO THE QUEUE
        queue.push({
          node: node.children[key],
          prefix: prefix + key,
        });
      }
  }

  return words;
}

Full Code

class TrieNode {
  constructor() {
    // CONTAINS ZERO OR MORE CHILDREN
    this.children = {};
    // INDICATE WHETHER THIS IS THE END OF A WORD
    this.last = false;
  }
}

class Autocomplete {
  constructor(list) {
    // INITIALIZE THE ROOT
    this.root = this.buildTrie(list);
  }

  /**
   * BUILD A TRIE DATA STRUCTURE
   * @param {string[]} list An array of words
   * @return {TrieNode}
   * @time complexity: O(N M) where N is the size of the list, M is the length of the words in the list
   * @space complexity: O(N M) where N is the size of the list, M is the length of the words in the list
   */
  buildTrie(list) {
    // THE ROOT OF THE TREE
    const root = new TrieNode();
    
    // FOR EACH WORD
    for (const word of list) {
      // ADD TO THE ROOT
      this.insert(root, word);
    }

    return root;
  }

  /**
   * INSERT A CHILD NODE TO THE PARENT NODE
   * A CHILD NODE IS CONSTRUCTED BASE ON EACH CHARACTER IN THE GIVEN WORD
   * @param {TrieNode} node 
   * @param {string} word
   * @time complexity: O(N) where N is the size of word
   * @space complexity: O(N) where N is the size of word
   */
  insert(parent, word) {
    // ITERATE EACH CHARACTER IN THE WORD
    for (const char of word) {
      // ADD A CHILD IF IT DOES NOT EXIST
      if (parent.children[char] === undefined) {
        parent.children[char] = new TrieNode();
      }
      // SET THE CURRENT CHILD AS PARENT
      parent = parent.children[char];
    }

    // SET TO TRUE BECAUSE THIS IS THE END OF THE WORD
    parent.last = true;
  }

  /**
   * @typeof {object} StartsWithResult
   * @param {TrieNode}  node
   * @param {string}    word
   */

  /**
   * CHECK IF THERE IS A TRIE STARTS WITH THE INPUT
   * @param {string} input 
   * @return {StartsWithResult}
   * @time complexity O(N) where N is the size of input
   * @space complexity O(N) where N is the size of input
   */
  isStartsWith(input) {
    let node = this.root,
        found = true,
        word = '';

    for (const char of input) {
      // CHECK IF THE CHARACTER EXIST
      if (node.children[char]) {
        node = node.children[char];
        word += char;
      } else if (node.children[char.toUpperCase()]) { // CASE INSENITIVE
        node = node.children[char.toUpperCase()];
        word += char.toUpperCase();
      } else {
        found = false;
        break;
      }
    }

    // RETURNS THE CURRENT NODE AND WORD
    return {
      node: (found ? node : null),
      word,
    };
  }

/**
 * GET AN ARRAY OF SUGGESTIONS
 * @param {string} input
 * @return {string[]}
 * @time complexity: O(N) where N is the size of input
 * @space complexity: O(N) where N is the size of input
 */
getAutocompleteListDfs(input) {
  // CHECK IF IT STARTS WITH INPUT
  const { node, word } = this.isStartsWith(input);

  // RETURN EMPTY ARRAY IF NO MATCH
  if (!node) {
    return [];
  }

  const words = [];

  // ADD TO THE RESULT IF THE CURRENT NODE ALSO THE LAST OF A WORD
  if (node.last) {
    words.push(word);
  }

  return words.concat(this.getWordsDepthFirstSearch(word, node));
}

/**
 * GET AN ARRAY OF SUGGESTIONS
 * @param {string} input
 * @return {string[]}
 * @time complexity: O(N) where N is the size of input
 * @space complexity: O(N) where N is the size of input
 */
getAutocompleteListBfs(input) {
  // CHECK IF IT STARTS WITH INPUT
  const { node, word } = this.isStartsWith(input);

  // RETURN EMPTY ARRAY IF NO MATCH
  if (!node) {
    return [];
  }

  const words = [];

  // ADD TO THE RESULT IF THE CURRENT NODE ALSO THE LAST OF A WORD
  if (node.last) {
    words.push(word);
  }

  return words.concat(this.getWordsBreathFirstSearch(word, node));
}

  /**
   * GET ALL THE WORDS FROM THE GIVEN NODE
   * USING DEPTH FIRST SEARCH ALGORITHM
   * @param {string} prefix
   * @param {TrieNode} node 
   * @return {string[]} An array of words
   * @time complexity: O(N) where N is the size of node
   * @space complexity: O(N) where N is the size of node
   */
  getWordsDepthFirstSearch(prefix, node) {
    if (!node) {
      return;
    }

    let words = [];

    for (const char in node.children) {
      // IF THIS IS THE END OF THE WORD
      if (node.children[char].last) {
        // CONCATENATE THE PREVIOUS CHARACTERS AND THE CURRENT CHARACTER
        // ADD IT TO THE WORDS
        words.push(prefix + char);
      }

      words = [
        ...words,
        ...this.getWordsDepthFirstSearch(prefix + char, node.children[char]) // RECURSIVELY GET THE REST OF THE CHARACTERS
      ];
    }

    return words;
  }

  /**
   * GET ALL THE WORDS FROM THE GIVEN NODE
   * USING BREATH FIRST SEARCH ALGORITHM
   * @param {string} prefix
   * @param {TrieNode} node 
   * @return {string[]} An array of words
   * @time complexity: O(N) where N is the size of node
   * @space complexity: O(N) where N is the size of node
   */
  getWordsBreathFirstSearch(prefix, node) {
    const words = [];
    const queue = [
      { node, prefix }
    ];

    while (queue.length) {
       const { node, prefix } = queue.shift();

       for (const key in node.children) {
         // IF THIS IS THE END OF THE WORD
         if (node.children[key].last) {
           // CONCATENATE THE PREVIOUS CHARACTERS AND THE CURRENT CHARACTER
          // ADD IT TO THE WORDS
          words.push(prefix + key);
         }

         // ADD TO THE QUEUE
         queue.push({
           node: node.children[key],
           prefix: prefix + key,
         });
       }
    }

    return words;
  }
}

What are the difference between depth first search and breath first search?

Obviously, the result come from these two algoritms are quite different. The list of suggestions returned from depth first search are in order as they pass in to the consturctor. Whereas, the result returned from breath first search are from shortest to longest words.

Test Trie Solution

const states = [
  'Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut',
  'Delaware', 'Florida', 'Georgia', 'Hawaii', 'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas',
  'Kentucky', 'Louisiana', 'Maine', 'Maryland', 'Massachusetts', 'Michigan', 'Minnesota', 
  'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire', 'New Jersey', 
  'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon',
  'Pennsylvania', 'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Utah', 
  'Vermont', 'Virginia', 'Washington', 'West Virginia', 'Wisconsin', 'Wyoming'
];

describe('Autocomplete using Trie', () => {

  let autocomplete;

  beforeAll(() => {
    autocomplete = new Autocomplete(states);
  });

  test('should build a trie', () => {
    const words = [
      'apple',
      'cat',
      'car',
      'app',
    ];
    const autocomplete = new Autocomplete(words);
    const result = JSON.stringify(autocomplete.root);
    const expected = JSON.stringify({
      children: {
        a: {
          children: {
            p: {
              children: {
                p: {
                  children: {
                    l: {
                      children: {
                        e: {
                          children: {},
                          last: true,
                        },
                      },
                      last: false,
                    },
                  },
                  last: true,
                },
              },
              last: false,
            },
          },
          last: false,
        },
        c: {
          children: {
            a: {
              children: {
                t: {
                  children: {},
                  last: true,
                },
                r: {
                  children: {},
                  last: true,
                },
              },
              last: false,
            },
          },
          last: false,
        }
      },
      last: false,
    });

    expect(result).toEqual(expected);
  });

  describe('Depth First Search', () => {

    test('should return an array', () => {
      const result = autocomplete.getAutocompleteListDfs('');
      expect(result).toBeInstanceOf(Array);
    });
  
    test('should return an empty array when there is no match', () => {
      const result = autocomplete.getAutocompleteListDfs('q');
      expect(result).toHaveLength(0);
    });
  
    test('should return a list of possible values', () => {
      const result = autocomplete.getAutocompleteListDfs('New');
      const expected = [ 'New Hampshire', 'New Jersey',  'New Mexico', 'New York' ];
      expect(result).toEqual(expected);
    });
  
    test('should return an array containing exact one element', () => {
      const result = autocomplete.getAutocompleteListDfs('New York');
      const expected = [ 'New York' ];
      expect(result).toEqual(expected);
    });

  });

  describe('Breath First Search', () => {

    test('should return an array', () => {
      const result = autocomplete.getAutocompleteListBfs('');
      expect(result).toBeInstanceOf(Array);
    });
  
    test('should return an empty array when there is no match', () => {
      const result = autocomplete.getAutocompleteListBfs('q');
      expect(result).toHaveLength(0);
    });
  
    test('should return a list of possible values', () => {
      const result = autocomplete.getAutocompleteListBfs('New');
      const expected = [ 'New York', 'New Jersey',  'New Mexico', 'New Hampshire' ];
      expect(result).toEqual(expected);
    });
  
    test('should return an array containing exact one element', () => {
      const result = autocomplete.getAutocompleteListBfs('New York');
      const expected = [ 'New York' ];
      expect(result).toEqual(expected);
    });

  });


});

Autocomplete using Trie
  ✓ should build a trie (2 ms)
  Depth First Search
    ✓ should return an array (1 ms)
    ✓ should return an empty array when there is no match
    ✓ should return a list of possible values (1 ms)
    ✓ should return an array containing exact one element
  Breath First Search
    ✓ should return an array (1 ms)
    ✓ should return an empty array when there is no match
    ✓ should return a list of possible values
    ✓ should return an array containing exact one element

Conclusion

The trie solution performs better than brute force solution. This is because of their time complexity. In the survey form scenario, for every keystroke, the brute force solution will run 50 times (for each state); however, the trie solution will run the number of times based on the number of characters.



Mike Mai
Mike Mai   Brooklyn, New York
I am full-stack web developer, passionate about building world class web applications. Knowledge in designing, coding, testing, and debugging. I love to solve problems.