java - Algorithm to truncate string and eliminate duplicates (case insensitive) -
i have set of strings meet following constraints
- case sensitive
- max character length of 10
i want convert these strings following new constraints valid (instead of previous constraints)
- case insensitive
- max character length of 5
suppose initial set of strings follows
city, city, city, city, city, city, city, city, city, city, city
i have partial algorithm maps these strings following
cit, cit1, cit2, cit3, cit4, cit5, cit6, cit7, cit8, cit9, cit10
this done using following logic
- consider first string common prefix
- count number of matches in rest of strings (case insensitive match). in current case 10
- determine number of characters required suffix. in current case since need generate suffices 1 through 10, need reserve 2 characters suffix
- truncate common prefix (max characters - number of characters suffix). in current case (5 - 2) i.e 3 characters
- generate strings concatenating truncated common prefix , suffix
using above able map old set of strings new set , satisfy new constraints.
however algorithm breaks if original set had string gets generated algorithm
suppose initial set of strings
city, cit1, cit2, city, city, city, city, city, city, city, city, city, city,
in case, since cit1 , cit2 exist in initial set, algorithm breaks (since generates duplicate cit1 , cit2)
is there way can recursively handle ?
i suggest follows:
for each input string, s if (result.contains(s)) result.add(s) else s = next(s) while (result.contains(s)) result.add(s);
where next(s)
defined as
split s [prefixpart, numberpart] num = numberpart == null ? 0 : numberpart+1 prefixlength = math.min(prefixpart.length, 5 - num.length) return prefixpart.substring(0, prefixlength) + num
i.e. next("city") = "city0"
, next("cit45") = "cit46"
Comments
Post a Comment