642. Design Search Autocomplete System

題目網址：https://leetcode.cn/problems/design-search-autocomplete-system/

題意：為搜尋引擎設計一個自動補全系統, 用戶會輸入一個句子(最少包含一個 char, 並以 '#' 結尾)。

給一 string list sentence 和一整數 times, 長度皆為 n, 其中 sentence[i] 為之前輸入的句子, 而 times[i] 為該句子輸入的相應次數。對於 '#' 以外的每個輸入 char, 返回前 3 個熱門句子, 這些句子的 prefix 和已經輸入的句子的部分相同。

以下是詳細規則：

一個句子的熱度定義為歷史上用戶輸入該句子的總次數。

返回前 3 個熱門句子需依照熱度由高到低排序。若有多條熱度相同的句子, 則按照 ASCII 碼的順序輸出(ASCII 碼越小越前面)。

如果滿足條件的句子個數 < 3, 則將它們全部輸出。

如果輸入特殊 char '#', 則代表句子結束了, 請返回 empty list []。

實現 AutocompleteSystem class：

AutocompleteSystem(String[] sentences, int[] times)：使用 sentence 和 times 進行初始化。

List<String> input(char c) : 代表用戶輸入了 char c

如果 c == '#', 則返回 empty list []。

返回前 3 個熱門句子, 這些句子的 prefix 和已經輸入的句子的部分相同。如果滿足條件的句子個數 < 3, 則將它們全部輸出。

char c 要麼為小寫字母, 要麼為 ' ' 或 '#'。

Solution：

想法：利用 Trie + hash table + DFS, 其中 hash table 用來記錄所有滿足以 prefix 為開頭的所有句子、以及對應的次數, 而 res 負責儲存滿足以 prefix 為開頭的所有句子, 然後 res 會根據 hash table 中紀錄的次數和 ASCII 碼進行排序, 然後輸出前 3 高的句子

值得注意的是使用者輸入的句子也要 insert 到 trie 中

e.g. ['i', ' ', 'a', '#', 'a', '#'], 當搜尋完 i a 的句子後, 要把使用者輸入的 'i a' 也 insert 到 trie 中。同理 a 也是

class TrieNode{
public:
    TrieNode* children[27] = {nullptr}; // 多一個紀錄 ' '
    int times = 0; // end node 紀錄該句子的次數
    string sentence; // end node 紀錄整個句子
};

class Trie{
public:
    TrieNode *root;

    Trie() : root(new TrieNode()) {}

    void insert(const string& sentence, const int times){
        TrieNode *node = root;

        for (const auto& ch : sentence) {
            const int i = (ch == ' ') ? 26 : ch - 'a';
            if (!node->children[i]) {
                node->children[i] = new TrieNode();
            }
            node = node->children[i];
        }

        node->times += times; // end node 紀錄 times
        node->sentence = sentence; // end node 紀錄整個句子
    }

    vector<string> getHotSentences(const string& prefix){
        vector<string> res;
        TrieNode *node = root;
        unordered_map<string, int> times; // 紀錄滿足以 prefix 開頭的所有句子之次數

        // node =  prefix 的 end node
        for (const auto& ch : prefix) {
            const int i = (ch == ' ') ? 26 : ch - 'a';
            if (!node->children[i]) {
                return {};
            }
            node = node->children[i];
        }

        // 以 node 為 root 做 dfs, 找到以 prefix 為開頭的所有 sentence
        dfs(node, res, times);

        // 所有 sentences 根據 times 和 ASCII 碼做排序
        sort(res.begin(), res.end(), [&times](const string& s1, const string& s2){
            if (times[s1] != times[s2]) {
                return times[s1] > times[s2];
            }
            return s1 < s2;
        });

        // 若 res.size() < 3, 則直接返回 res; 否則, 返回 res 前三個
        if (res.size() < 3) {
            return res;
        }

        return vector<string>(res.begin(), res.begin() + 3);
    }

    // 找出所有符合 prefix 的 sentence
    void dfs(const TrieNode* node, vector<string>& res, unordered_map<string, int>& times){
        if (!node) {
            return;
        }

        // 句子結尾, 在 hash table 中記錄, 等等 sorting 會用到
        // 找到不直接 return, 而是要繼續往下嘗試
        if (node->times != 0) {
            times[node->sentence] = node->times;
            res.emplace_back(node->sentence);
        }

        // 繼續往下嘗試
        for (int i = 0; i < 27; ++i) {
            dfs(node->children[i], res, times);
        }
    }
};

class AutocompleteSystem {
public:
    AutocompleteSystem(vector<string>& sentences, vector<int>& times) {
        // 把所有的 sentence insert 到 trie 中
        for (int i = 0; i < sentences.size(); i++) {
            trie.insert(sentences[i], times[i]);
        }
    }

    vector<string> input(char c) {
        if (c == '#') {
            trie.insert(prefix, 1); // 把 '#' 之前輸入的句子 insert 到 trie 中
            prefix.clear(); // 清空當前句子
            return {};
        }

        prefix.push_back(c);

        return trie.getHotSentences(prefix);
    }

private:
    Trie trie;
    string prefix;
};

time：
- AutocompleteSystem()：$O(n \cdot m)$, 其中 n 為句子的個數, m 為句子的平均長度
- input()：$O(s + p + k \cdot log(k))$, 其中 s 為當前已輸入的句子, p 為 trie 中搜尋的 node 數, $O(k \cdot log(k))$ 為對長度為 k 的 list 做 sorting
space：$O(n \cdot m)$ ➔ trie, 其中 n 為句子的個數, m 為句子的平均長度