GuilinDev

Lc0139

05 August 2008

139 - Word Break

题目

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:

  • The same word in the dictionary may be reused multiple times in the segmentation.
  • You may assume the dictionary does not contain duplicate words.

Example 1:

1
2
3
Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

1
2
3
4
Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
             Note that you are allowed to reuse a dictionary word.

Example 3:

1
2
Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false

分析

1) DP

再次复习一下动态规划的思考和思路,其实DP源自于一个很自然的想法,就拿这道题来说,假如需要判断”onetwothreefour”这一个字符串能不能满足条件,很自然的想法就是: 如果”onetwothree”这一段可以拆分,再加上four如果也可以,那就行了; 或者 如果”onetwothre”这一段可以拆分,再加上efour如果也可以,那也行了; 这其实已经抓住了动态规划的最核心的东西了,换成式子来表达,就是

1
2
dp["onetwothreefour"] = dp["onetwothree"这一段] && 判断一下"four"
dp["onetwothreefour"] = dp["onetwothre"这一段] && 判断一下"efour"

这道题就搞定了。

动态规划的基本操作:

  1. 定义dp数组 - dp[i] 表示 s 前 i 个字符组成的字符串 s[0..i-1] 是否能被拆分成若干个字典中出现的单词
  2. 初始化 - dp[0]=true 表示空串且合法
  3. 转化公式 - dp[i]=dp[j] && check(s[j..i−1]),其中check(s[j..i−1]) 表示子串 s[j..i-1]是否出现在字典中
  4. 搞定!

DP的思路雷同,但每道题的实现有差别,比如这道题的遍历顺序稍微有点窍门,就是:对于检查一个字符串是否出现在给定的字符串列表里一般可以考虑哈希表(hashmap或hashset)来快速判断,同时也可以做一些简单的剪枝,正着枚举和倒着枚举都可以。

2) Trie

代码

DP不剪枝,好理解,时间O(n^2),空间O(n)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> wordSet = new HashSet<>(wordDict); // set的contains方法效率比list的contains方法效率高        
        int len = s.length();
        boolean[] dp = new boolean[len + 1]; // 多一个0个字符的字符串情况
        dp[0] = true; // 刚开始啥字符也没有,初始化为可以break
        
        for (int i = 1; i <= len; i++) {
            for (int j = i; j >= 0; j--) { //这里从0开始也可以,但反着查比较符合自然想法
                if (dp[j] && wordSet.contains(s.substring(j, i))) { // 从0到j(includsive)的位置和从j+1到i-1的位置为单词
                    dp[i] = true;
                    break; // 当前为true必须跳出
                }
            }
        }
        return dp[len];
    }
}

DP + 剪枝,当超过字典中最大字符串的长度时,就不再检查是否存在了,复杂度一样

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> wordSet = new HashSet<>(wordDict); // set的contains方法效率比list的contains方法效率高        
        int len = s.length();
        boolean[] dp = new boolean[len + 1]; // 多一个0个字符的字符串情况
        dp[0] = true; // 初始化
        
        // 找到wordDict中最大长度的单词
        int maxWordLen = 0;
        for (String word : wordDict) {
            maxWordLen = Math.max(maxWordLen, word.length());
        }
        
        for (int i = 1; i <= len; i++) {
            /**
            第二个循环,这里j在i左边,从i位置开始,这里需要计算的是[i - maxlength, i]这个区间里有没有满足dp[j] && wordSet.contains(s.substring(j,i)条件的情况,
            这里j不能从0开始往右直到i,否则计算的是[0, i - maxlength]这个区间,和要求的是反着的。
            仔细思考一下:相当于j + maxWordLen >= i,所有从i开始向左比较好枚举
            */
            for (int j = i; j >= 0 && j >= i - maxWordLen; j--) { // j >= i - maxWordLen,dp[0]相当于占位,所以不用考虑+1 -1的问题
                if (dp[j] && wordSet.contains(s.substring(j, i))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[len];
    }
}

BFS

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
public class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> wordDictSet = new HashSet<>(wordDict);
        Queue<Integer> queue = new LinkedList<>();
        boolean[] visited = new boolean[s.length()];
        queue.add(0);
        while (!queue.isEmpty()) {
            int start = queue.remove();
            if (visited[start]) {
                continue;
            }
            for (int end = start + 1; end <= s.length(); end++) {
                if (wordDictSet.contains(s.substring(start, end))) {
                    queue.add(end);
                    if (end == s.length()) {
                        return true;
                    }
                }
            }
            visited[start] = true;
        }
        return false;
    }
}

Trie + 记忆化,需要构建Trie,较麻烦,不要求

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
class Solution {
    class Trie {
        class Node {
            boolean present;
            HashMap<Character, Node> next;

            Node() {
                next = new HashMap<>();
            }
        }

        private Node root;

        Trie() {
            root = new Node();
        }

        public void insert(String word) {
            Node p = root;
            for (int i = 0; i < word.length(); i++) {
                char c = word.charAt(i);
                if (!p.next.containsKey(c)) {
                    p.next.put(c, new Node());
                }
                p = p.next.get(c);
                if (i == word.length()-1) {
                    p.present = true;
                }
            }
        }

        public boolean contains(String word) {
            Node p = root;
            for (int i = 0; i < word.length(); i++) {
                char c = word.charAt(i);
                if (!p.next.containsKey(c)) return false;
                p = p.next.get(c);
                if (i == word.length()-1) {
                    if (p.present) return true;
                }
            }
            return false;
        }

        public boolean startsWith(String prefix) {
            Node p = root;
            for (int i = 0; i < prefix.length(); i++) {
                char c = prefix.charAt(i);
                if (!p.next.containsKey(c)) return false;
                p = p.next.get(c);
            }
            return true;
        }
    }

    public boolean wordBreak(String s, List<String> wordDict) {
        Trie trie = new Trie();
        for (String word : wordDict) {
            trie.insert(word);
        }
        HashMap<String, Boolean> memo = new HashMap<>();
        return canBreak(s, trie, memo);
    }

    private boolean canBreak(String s, Trie trie, HashMap<String, Boolean> memo) {
        if (memo.containsKey(s)) return memo.get(s);
        for (int i = 1; i <= s.length(); i++) {
            if (trie.startsWith(s.substring(0, i))) {
                if (trie.contains(s.substring(0, i))) {
                    if (canBreak(s.substring(i, s.length()), trie, memo)) {
                        memo.put(s, true);
                        return true;
                    }
                }
                if (i == s.length()) {
                    if (!trie.contains(s)) {
                        memo.put(s, false);
                        return false;
                    }
                }
            }
            else {
                memo.put(s, false);
                return false;
            }
        }
        memo.put(s, true);
        return true;
    }
}