Skip to content

Latest commit

 

History

History
346 lines (252 loc) · 16.1 KB

en.md

File metadata and controls

346 lines (252 loc) · 16.1 KB

LOJ 1013 - Love Calculator


Tags : String, Subsequence, Dynamic Programming, Memoization

We need to find some shortest possible Strings(meaning that they won't have any unnecessary suffix or prefix) that can produce the names as Subsequences and we have to output :

  1. How many unique Strings are there?
  2. What is the length?

Helpful Resources

Solution

The problem statement has already mentioned that the length of the names (Strings) will not cross 30. We will take two 2D arrays of size 31 x 31 (although we just need length of name1 * length of name2 for each test case but let's keep it this way just for the sake of ease of implementation and since neither it consumes a lot of memory),

  • one for memoizing the length of the shortest String
  • another one to keep record of how many unique Strings are generated.

But first let us understand what type of Strings we are counting and to do that let's take an example:

USA
USSR

For the above example (and others), all (meaning that they may contain unnecessary suffix and prefix) the possible Strings whose Subsequences can produce both the name is actually inifinite. For example:

USSRA
abcdUSSRApzxockbvbkpvzxco
USASRadasdasd
USSAR
fwetUSSARrweUSASRadasdasd
dasdsadasdsadcwqrfqtgUSASR
gjoUSSRApidhUSSARglkdfg
dfgdgfUSSAR
USSARgnsdihgisgf
...
...

But the problem has asked us to find the shortest Strings whose Subsequences can produce both of the names. So we don't keep any unnecessary suffix or prefix. For distinction only and to make them easily detectable, some Substrings are capitalized in each Strings shown above, from which we can obtain both USA and USSR.

For example : From the String fwetUSSARrweUSASRadasdasd, we can take any of the Substrings USSAR or USASR which both of them can produce the Subsequences USA and USSR. Remember that both the Substrings and Subsequences must maintain the order of characters of its original String. The only difference between Substrings and Subsequences is that Substrings must contain consecutive characters too while Subsequences has NO such bound.

Now if we just think of shortest unique Strings for this particular test case, those would be only these 3 :

USSRA
USSAR
USASR

And as already explained before, RAUSS or similar Strings that contain all the characters of both the names but if the characters to produce the names can NOT be retrieved in a Sequential order, are not Subsequences at all, thus we are not counting them nor they are our concern.

Now that we have finally understand what type of Strings we are looking for, let's understand how we are using the arrays to memoize and find the solutions.

How are we keeping track of common characters (shortest length of the substring)?

At first we will take the names and shift their characters right by 1 index for ease of comparison and implementation in the upcoming steps. Also because at length == 0 they won't have any common characters at all. As we have shifted the characters to right by 1 index, then we simply create both the matrix for dynamic programming by occupying the first column and first row. We can we do it in any fashion we want. Row and Column represents the characters of each of names. I have chosen Row to represent the first name and the Column to represent the second name, it can be done other way around as well.

For the shortest Strings matrix, we will put the max(row,column) value to each of blocks in j-th column and i-th row. For example: for (4,0) and (0,4), 4 is the maximum of both the values so those two blocks will have 4 as their value. We are doing this because we are assuming name1 == name2 at the initial stage.

The shortestString matrix should look like something like this at the initial state:

I/J 0 1 2 3 4 ... 30
0 0 1 2 3 4 ... 30
1 1
2 2
3 3
4 4
... ...
30 30

Now to memoize the shortestString matrix, we must check the characters of a name in respect to the other one. Each block represents the common length of the shortest string required for both of the names' 1st character to those particular indice of those names respectively. (Remember, we already shifted the characters by 1 character.)

For Example : shortestString[1][3]'s value is the length required when I am taking name1's to substring length = 1 (USA's length 1 substring == "U") and name2's substring length = 3 (USSR's length 3 substring == "USS"). So, shortestString[1][3] = 3 is required as the bare minimum length for U and USS.

i/j 0 1(U) 2(S) 3(S)
0 0 1 2 3
1(U) 1 1(U) 2(US) 3(USS)

We just simply keep adding 1 to the last updated cumulative sum. Let's understand what to do on which condition :

  • Case name1[i] == name2[j] : Since both of the characters are same, we don't need to extend the string length. We can just add 1 to previous common length from upper-left block, which is shortestString[i-1][j-1]. Look at the table below for clarification of an example:

    • Example :

      i[2] == j[3]

      For "US" of "USA" (since i == 2) and "USS" of "USSR" (since j == 3)

      So,

      shortestString[2][3]
      = shortestString[2-1][3-1] + 1
      = 2 + 1 = 3
      
  • Case name1[i] != name2[j] : Since the characters don't match, we will take the minimum (look at example to understand why not maximum) length encountered between first name's length and second name's length. For clarification let us again look at the table with an example:

    • Example :

      i[1] != j[2]

      So,

      shortestString[2][3]
      = min(shortestString[2][2],shortestString[1][3]) + 1
      = shortestString[2][2] + 1 = 3
      
      

      if we had taken maximum then it had been

      shortestString[2][3]
      = shortestString[1][3] + 1 = 4
      

      But do we really need USSS to get name1's 2 length substring US and name2's 3 length substring USS? In other words, by taking maximum we are actually considering US and USSS which is because USS is already included in shortestString[1][3]; thus unnecessary suffix has been added.

In example of USA and USSR, it will look something like this after memoization.

i/j 0 1(U) 2(S) 3(S) 4(R)
0 0 1 2 3 4
1(U) 1 1(U) 2(US) 3(USS) 4(USSR)
2(S) 2 2(US) 2(US) 3(USS) 4(USSR)
3(A) 3 3(USA) 3(USA) 4
(USA + S)
[from j=3]
/ (USS + A)from[i=3]
=(USAS/USSA)
5
((USAS/USSA) + R))
/(USSR + A)
=(USASR/USSAR/USSRA)

And shortestString[name1.length()-1][name2.length()-1] give us the length of the shortest string which is 5.

How are we keeping track of unique strings?

For the unique String matrix, we occupy the 1st column and 1st row with 1s because we are assuming that the number of unique string is 1, simply name1 = name2 meaning there is no difference of characters.

The uniqueString matrix should look like this initially:

I/J 0 1 2 3 4 ... 30
0 1 1 1 1 1 ... 1
1 1
2 1
3 1
4 1
... ...
30 1

While populating there can be these cases :

  • Case name1[i] == name2[j] : When the characters matches, we don't need to create/branch out to another new sequence and thus we simply just update the value of the current block by taking the number of unique string from upper-left block.

  • Case name1[i] != name2[j] && shortestString[i][j - 1] == shortestString[i - 1][j] : It means there had been 2 unique strings sequence of the same length as we can already see in the shortestString matrix. And we need to take both the sequence in account.

    • Example :

      shortestString[3][3] is populated by this way. How? name1 length 3 substring is USA and name2 length 3 substring is USS. shortestString[2][3] is USS and shortestString[3][2] is USA and neither of them can produce the other name's substring upto that length. So we can add the unique sequence character A of name1 to name2's length 3 substring USS+A; also if we do it the other way around, add the unique sequence character S of name2 to name1's length 3 substring USA+S, is valid and has to be counted as per question requirement. So, we are adding the number of unique strings found in those states.

      shortestString[3][3]
      = number of unique strings found for name1 length 3 substring
      + number of unique strings found for name2 length 3 substring
      = shortestString[3][2] + shortestString[2][3] = 2
      
      
  • Case name1[i] != name2[j] && shortestString[i][j - 1] != shortestString[i - 1][j] : In this case we just simply take the minimum of which can be taken from left block or upper block. The logic is similar to shortestString[][] population. We don't want unnecessary suffix to be added.

If we have done everything accordingly, uniqueString matrix should look like this :

i/j 0 1 2 3 4
0 1 1 1 1 1
1 1 1 1 1 1
2 1 1 1 1 1
3 1 1 1 2 3

Now, uniqueString[name1.length()-1][name2.length()-1] tells us how many unique Strings are there.

The above implementation is accepted.

Caution : The matrix to memoize the number of unique strings, must be of an integer data type that can hold 263.

Notes :

  • You can solve the problem recursively too and it won't throw any MLE.
  • long data type's size of c++ is dependent on the hardware (processor) and the software (operating system and compiler). LightOJ uses a 64-bit compiler so only long is sufficient. However we might need to use long long if any of the dependency is 32-bit.

Solution in C++ (Iterative)

#include <bits/stdc++.h>

using namespace std;

int main()
{
    ios::sync_with_stdio(false);
    cin.tie(NULL);
    cout.tie(NULL);

    int testCases;
    string name1, name2;

    int shortestString[31][31];
    long uniqueString[31][31];

    cin >> testCases;

    for (int testCase = 1; testCase <= testCases; testCase++)
    {
        cin >> name1 >> name2;


        //Shift the characters of the name to right for ease of memoizing
        name1.insert(0, "0");
        name2.insert(0, "1");

        //Prepare the matrices for memoization
        for (int i = 0; i < 31; i++)
            shortestString[0][i] = shortestString[i][0] = i, uniqueString[i][0] = uniqueString[0][i] = 1;

        for (int i = 1; name1[i]; i++)
        {
            for (int j = 1; name2[j]; j++)
            {
                //Checking if we need to take the cumulative sum from upper-left block
                if (name1[i] == name2[j])
                {
                    //Adding 1 to cumulative sum from upper-left block
                    shortestString[i][j] = 1 + shortestString[i - 1][j - 1];

                    //No need to add a new branch of unique strings so taking cumulative sum from upper-left block
                    uniqueString[i][j] = uniqueString[i - 1][j - 1];

                }
                else
                {
                    //Finding the minimum from left and upper block and adding 1 to the value of current block
                    shortestString[i][j] = 1 + min(shortestString[i][j - 1], shortestString[i - 1][j]);

                    //Checking if there are two unique strings of the same length
                    if (shortestString[i][j - 1] == shortestString[i - 1][j])
                        uniqueString[i][j] = uniqueString[i][j - 1] + uniqueString[i - 1][j];

                        //Checking if left block has the minimum value in shortestString matrix
                    else if (shortestString[i][j - 1] < shortestString[i - 1][j])
                        uniqueString[i][j] = uniqueString[i][j - 1];
                    else
                        uniqueString[i][j] = uniqueString[i - 1][j];
                }
            }
        }
        cout << "Case " << testCase << ": " << shortestString[name1.length() - 1][name2.length() - 1] << " " << uniqueString[name1.length() - 1][name2.length() - 1] << "\n";
    }
    return 0;
}

Solution in C++ (Recursive)

#include <bits/stdc++.h>

using namespace std;

string name1, name2;
int shortestString[31][31];
long uniqueString[31][31];

void memoizeArrays(int i, int j){
    if(i == name1.length()) //base case
        return;

    if(j == name2.length()) //check next row and set column to 1
        return memoizeArrays(++i, 1);

    else{
        if (name1[i] == name2[j]){

          //Adding 1 to cumulative sum from upper-left block/current common length of both string
          shortestString[i][j] = 1 + shortestString[i - 1][j - 1];

          //No need to add a new branch of unique strings so taking cumulative sum from upper-left block
          uniqueString[i][j] = uniqueString[i - 1][j - 1];
        }else{

          //Finding the minimum from left and upper block and adding 1 to the value of current block
            shortestString[i][j] = 1 + min(shortestString[i][j - 1], shortestString[i - 1][j]);

          //Checking if there are two unique strings of the same length
            if (shortestString[i][j - 1] == shortestString[i - 1][j])
                uniqueString[i][j] = uniqueString[i][j - 1] + uniqueString[i - 1][j];
              //Checking if left block has the minimum value in shortestString matrix
            else if (shortestString[i][j - 1] < shortestString[i - 1][j])
                uniqueString[i][j] = uniqueString[i][j - 1];
            else
                uniqueString[i][j] = uniqueString[i - 1][j];
        }

        //Check next column of the same row
        return memoizeArrays(i, ++j);
    }
}

int main()
{
    ios::sync_with_stdio(false);
    cin.tie(NULL);
    cout.tie(NULL);

    int testCases;

    cin >> testCases;

    for (int testCase = 1; testCase <= testCases; testCase++)
    {
        cin >> name1 >> name2;

        //Shift the characters of the name to right for ease of memoizing
        name1.insert(0, "0");
        name2.insert(0, "1");

        //Prepare the matrices for memoization
        for (int i = 0; i < 31; i++)
            shortestString[0][i] = shortestString[i][0] = i, uniqueString[i][0] = uniqueString[0][i] = 1;

        //Memoize the array
        memoizeArrays(1, 1); //Remember, we shifted characters to the right by 1 character
        cout << "Case " << testCase << ": " << shortestString[name1.length() - 1][name2.length() - 1] << " " << uniqueString[name1.length() - 1][name2.length() - 1] << "\n";
    }
    return 0;
}