05 August 2008
同源字符串检测
原字符串由小写字母组成,可以按下述步骤编码:
任意将其 分割 为由若干 非空 子字符串组成的一个 序列 。
任意选择序列中的一些元素(也可能不选择),然后将这些元素替换为元素各自的长度(作为一个数字型的字符串)。
重新 顺次连接 序列,得到编码后的字符串。
例如,编码 “abcdefghijklmnop” 的一种方法可以描述为:
将原字符串分割得到一个序列:[“ab”, “cdefghijklmn”, “o”, “p”] 。
选出其中第二个和第三个元素并分别替换为它们自身的长度。序列变为 [“ab”, “12”, “1”, “p”] 。
重新顺次连接序列中的元素,得到编码后的字符串:”ab121p” 。
给你两个编码后的字符串 s1 和 s2 ,由小写英文字母和数字 1-9 组成。如果存在能够同时编码得到 s1 和 s2 原字符串,返回 true ;否则,返回 false。
注意:生成的测试用例满足 s1 和 s2 中连续数字数不超过 3 。
示例 1:
1
2
3
4
5
6
7
8
9
10
11
输入:s1 = "internationalization", s2 = "i18n"
输出:true
解释:"internationalization" 可以作为原字符串
- "internationalization"
-> 分割: ["internationalization"]
-> 不替换任何元素
-> 连接: "internationalization",得到 s1
- "internationalization"
-> 分割: ["i", "nternationalizatio", "n"]
-> 替换: ["i", "18", "n"]
-> 连接: "i18n",得到 s2
一般遇到这类字符串通配类型的问题,都会想到类似于编辑距离的动态规划模型,即dp(i,j)dp(i,j)用来表示第一个字符串的前i个字符和第二个字符串的前j个字符能否匹配。这道题也不例外,简单概括来说,两个字符串缩写可以看成是一系列小写英文字母,其中夹杂着特殊通配符(即数字),但每个特殊通配符能通配的字母数量是有要求的,而且解释方法并不唯一(例如两位数字,既可以整体解释,也可以各自解释成一个通配符)。
这种模型带来的一个问题在于前i个字符和前j个字符匹配时,不一定能做到完全匹配,可能只能部分匹配,即其中一个字符串长一点,而且长出来的部分全都是可以通配字母的部分,这种情况可以理解为“最后一个通配符匹配了一部分”,除此以外实际上和编辑距离动态规划是一样的。我们将状态表示为dp(i, j, v)dp(i,j,v),其中v表示多余的未匹配的通配符长度(可能由多个通配符组成),正数表示第一个字符串有多余,负数表示第二个字符串有多余,则和编辑距离类似,各自考虑向dp(i - 1, j)dp(i−1,j),dp(i, j-1)dp(i,j−1),dp(i-1, j-1) dp(i−1,j−1) 转移即可,不过这里因为数字可能长度不止是1,所以需要考虑最后一个数字有多位的情况,大致上可以分为以下几种情况:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
class Solution {
// ref = https://leetcode.com/problems/check-if-an-original-string-exists-given-two-encoded-strings/discuss/1550342/Java-Clean-(DFS-// /
Boolean[][][] dp;
boolean dfs(int i, int j, int diff, String s1, String s2) {
if (i >= s1.length() && j >= s2.length() && diff == 0) return true;
// diff > 0, j > i
// diff < 0, i > j
// System.out.println(" == " + diff + " " + (diff+1000) + " " + dp[i][j][diff + 1000]);
if (dp[i][j][diff + 1000] != null) {
return dp[i][j][diff + 1000];
}
boolean res = false;
if (i < s1.length()) {
if (Character.isDigit(s1.charAt(i))) {
int value = 0, count = 0;
while (i + count < s1.length() && count < 3 && Character.isDigit(s1.charAt(i + count))) {
value = value * 10 + (s1.charAt(i + count) - '0');
count++;
if (dfs(i + count, j, diff - value, s1, s2)) res = true;
}
} else {
if (diff > 0) {
if (dfs(i + 1, j, diff - 1, s1, s2)) res = true;
} else if (diff == 0 && j < s2.length() && s1.charAt(i) == s2.charAt(j)) {
if (dfs(i + 1, j + 1, diff, s1, s2)) res = true;
}
}
}
if (j < s2.length()) {
if (Character.isDigit(s2.charAt(j))) {
int value = 0, count = 0;
while (j + count < s2.length() && count < 3 && Character.isDigit(s2.charAt(j + count))) {
value = value * 10 + (s2.charAt(j + count) - '0');
count++;
if (dfs(i, j + count, diff + value, s1, s2)) res = true;
}
} else {
if (diff < 0) {
if (dfs(i, j + 1, diff + 1, s1, s2)) res = true;
}
}
}
return dp[i][j][diff + 1000] = res;
}
public boolean possiblyEquals(String s1, String s2) {
int m = s1.length();
int n = s2.length();
dp = new Boolean[m + 1][n + 1][2001];
return dfs(0, 0, 0, s1, s2);
}
}
自顶向下
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
class Solution {
//112ms
public boolean possiblyEquals(String s1, String s2) {
return dfs(s1.toCharArray(), s2.toCharArray(), 0, 0, 0, new Boolean[s1.length() + 1][s2.length() + 1][2001]);
}
boolean dfs(char[] s1, char[] s2, int i, int j, int diff, Boolean[][][] dp) {
if (i == s1.length && j == s2.length) {
return diff == 0;
}
if (dp[i][j][diff + 1000] != null)
return dp[i][j][diff + 1000];
// if both i and j are at the same location and chars are same then simply increment both pointers
if (i < s1.length && j < s2.length && diff == 0 && s1[i] == s2[j]) {
if (dfs(s1, s2, i + 1, j + 1, diff, dp)) {
return dp[i][j][diff + 1000] = true;
}
}
// if s1[i] is literal and diff > 0 then increment i and decrement diff by 1
if (i < s1.length && !Character.isDigit(s1[i]) && diff > 0 && dfs(s1, s2, i + 1, j, diff - 1, dp)) {
return dp[i][j][diff + 1000] = true;
}
// if s2[j] is literal and diff < 0 then increment j and increment diff by 1
// as we are done with the current jth char
if (j < s2.length && !Character.isDigit(s2[j]) && diff < 0 && dfs(s1, s2, i, j + 1, diff + 1, dp)) {
return dp[i][j][diff + 1000] = true;
}
// wildcard matching in s1
// if s1 contains l123
// then need to check with val as 1 then val as 12 and val as 123
for (int k = i, val = 0; k < i + 4 && k < s1.length && Character.isDigit(s1[k]); k++) {
val = val * 10 + s1[k] - '0';
if (dfs(s1, s2, k + 1, j, diff - val, dp)) {
return dp[i][j][diff + 1000] = true;
}
}
// wildcard matching in s2
// if s2 contains l123
// then need to check with val as 1 then val as 12 and val as 123
for (int k = j, val = 0; k < j + 4 && k < s2.length && Character.isDigit(s2[k]); k++) {
val = val * 10 + s2[k] - '0';
if (dfs(s1, s2, i, k + 1, diff + val, dp)) {
return dp[i][j][diff + 1000] = true;
}
}
return dp[i][j][diff + 1000] = false;
}
}