Beautiful, unique strings

There may come a time in your career when you will have to solve the issue of finding the most beautiful and unique string from an input, or else!…OK, I admit the probability of this is very low, but if you are interested in solving a brain-teasing problem, here is you’re chance.

Problem:
1. String s is called unique if all the characters of s are different.
2. String s2 is producible from string s1, if we can remove some characters of s1 to obtain s2.
3. String s1 is more beautiful than string s2 if the length of s1 is greater than the length of s2 or, if they have equal length, if s1 is lexicographically greater than s2.
Given a string s, you have to find the most beautiful and unique string that is producible from s.

Solution:
Here are the main ideas behind the solution:

  1. Let the input and the output strings be represented as “I[]“ and “O[]“.
  2. For each character “c” from the input string I[], do the following validations:
    1. If c is not found in O[], append c to O[]
    2. If c already exists in O[]:
      1. Let “p” be the position of the character c in O[], and “ip” be the position of the character in I[]
      2. Find the first character in O[], placed after the position p, that is lexicographically greater than c
          1. If no such character is found,do nothing and continue with the next character validation from the input string
          2. If there is such character:
            1. Let’s remember it’s position in “gp”
            2. Take all the characters from O[], that are found between the positions p and gp
            3. If all of these characters are founded in I[] after the position ip, remove the character from O[] at position p and append the character at the end of O[]

Here is the implementation in C#: sources

Example:
I = “ccabdcab”
O = “”

  1. c = ‘c’
    Character not found in O
    O = “c”
  2. c = ‘c’
    Character already in O.
    p = 0;
    No greater character is found in O after p => do nothing.
    O = “c”
  3. c = ‘a’
    Character not found in O
    O = “ca”
  4. c = ‘b’
    Character not found in O.
    O = “cab”
  5. c = ‘d’
    Character not found in O.
    O = “cabd”
  6. c = ‘c’
    Character already in O.
    p = 0;
    ip = 5;
    A greater character (‘d’) is found in O at position gp = 3.
    All the characters between p and gp (‘a’,’b’) are found in I[] after the position ip = 5. This means that these characters will be validated and moved later in the process, and therefore the character from the position p can be safely removed from O.
    O =”abdc”
  7. c = ‘a’
    Character already in O.
    p=0;
    ip=6
    p = 0;
    A greater character (‘b’) is found in O at position gp = 1. Since there is no smaller character between ‘a’ and ‘b’ in the output string, the character ‘a’ can be safely removed from the position p=0, and appended at the end of O.
    O = “bdca”
  8. c = ‘b’
    Character already in O.
    p = 0;
    A greater character (‘d’) is found in O at position gp = 1. Since there is no smaller character between ‘b’ and ‘d’ in the output string, the character ‘b’ can be safely removed from the position p=0, and appended at the end of O.
    O = “dcab”

And since everybody is going bananas associating the following two examples with this problem, here is the generated output in those cases, following this solution:

Input: babab 
Output: ba

Input: nlhthgrfdnnlprjtecpdrthigjoqdejsfkasoctjijaoebqlrgaiakfsbljmpibkidjsrtkgrdnqsknbarpabgokbsrfhmeklrle 
Output: tsocrpkijgdqnbafhmle

About these ads

2 comments

    • Rasemanju

      Sure. At what step are you having troubles?
      The main idea is to create the result in an almost greedy manner. You take one letter at a time from the input string and you compare it to the characters you have in the result output.
      The more tricky part of this algo comes when you already have the character in the output. We now start wondering about the lexicographically part of the problem. If you find a lexicographically greater char after it in the output, it triggers the possibility of finding a lexicographically greater result. The condition is that all the characters between these two characters (which are obviously lexicographically smaller) are to be founded after the character which is lex. greater, in the input string. This means that all of this chars will be treated in the next iterations and will pe replaced in the output string, after the char which is lex. greater. If one char is not founded in the input string, it will remain at it’s current position, and the output string will be lexicographically smaller instead.
      Please let me know if this explanation answered your questions.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s