Tag Archive for 'programming'

Knuth-Morris-Pratt String Search Algorithm

The KMP algorithm searches for the occurrence of a word W in a sentence S while employing the observation that when a mismatch occurs W contains sufficient information to determine the start position of the next search in S.

Say W = abcd and S = abcbabcd. Now we compare W[0] with S[0], W[1] with S[1], W[2] with S[2] and they all match. But W[3] does not match S[3]. The brute force algorithm would have started the next search by comparing W[0] with S[1]. But a simple observation tells us that it would be fruitless as there is no W[0] till S[3]. We would have done better by starting the next search from S[4]. And this is exactly what the KMP tries to achieve.

KMP requires a pre-processing of the word W. We create a Partial Match Table by iterating over all the letters of W in such a way that in case of a mismatch we can quickly figure out the next start position by looking at the table. The wiki article on KMP describes with an example how to create the table:

KMP – Partial Match Table

The partial match table can be created in O(n) time using O(n) space, where n is the length of the word W. Once we have the table we can iterate over each element of S consulting the table every time a mismatch occurs. You may want to read up the algorithm with examples from

KMP – Algorithm With Examples – Wikipedia

The search happens in O(k) where k is the length of the sentence S. Thus the algorithm runs in O(n + k).

Below is my java implementation of the same:

/**
* Perform the Knuth-Morris-Pratt string search. The algorithm first pre-processes the word to create a Partial Match
* Table. The table is creatd in O(n) where n is the length of the word. After the pre-processing, the actual search can
* be performed in O(k) where k is the size of the sentence.
*/
public class KMPStringSearch {
/**
* Searches for all occurances of the word in the sentence. Runs in O(n+k) where n is the word length and k is the
* sentence length.
*
* @param word The word that is being searched
* @param sentence The collections of word over which the search happens.
* @return The list of starting indices of the matched word in the sentence. Empty list in case of no match.
*/
public List<Integer> searchString(final String word, final String sentence) {
final List<Integer> matchedIndices = new ArrayList<>();

final int sentenceLength = sentence.length();
final int wordLength = word.length();
int beginMatch = 0; // the starting position in sentence from which the match started
int idxWord = 0; // the index of the character of the word that is being compared to a character in string
final List<Integer> partialTable = createPartialMatchTable(word);
while (beginMatch + idxWord < sentenceLength)
if (word.charAt(idxWord) == sentence.charAt(beginMatch + idxWord)) {
// the characters have matched
if (idxWord == wordLength - 1) {
// the word is complete. we have a match.
matchedIndices.add(beginMatch);
// restart the search
beginMatch = beginMatch + idxWord - partialTable.get(idxWord);
if (partialTable.get(idxWord) > -1) idxWord = partialTable.get(idxWord);
else idxWord = 0;
} else idxWord++;
} else {
// mismatch. restart the search.
beginMatch = beginMatch + idxWord - partialTable.get(idxWord);
if (partialTable.get(idxWord) > -1) idxWord = partialTable.get(idxWord);
else idxWord = 0;
}

return Collections.unmodifiableList(matchedIndices);
}

/**
* Creates the Partial Match Table for the word. Runs in O(n) where n is the length of the word.
*
* @param word The word whose Partial Match Table is required.
* @return The table as a list of integers.
*/
public List<Integer> createPartialMatchTable(final String word) {
if (StringUtils.isBlank(word)) return Collections.EMPTY_LIST;

final int length = word.length();
final List<Integer> partialTable = new ArrayList<>(length + 1);
partialTable.add(-1);
partialTable.add(0);

final char firstChar = word.charAt(0);
for (int idx = 1; idx < word.length(); idx++) {
final int prevVal = partialTable.get(idx);
if (prevVal == 0) {
if (word.charAt(idx) == firstChar) partialTable.add(1);
else partialTable.add(0);
} else if (word.charAt(idx) == word.charAt(prevVal)) partialTable.add(prevVal + 1);
else partialTable.add(0);
}

return Collections.unmodifiableList(partialTable);
}
}

Popularity: 3% [?]

Your ClassNotFound Error Is Probably Not Telling You Everything

Quicktip
If a class fails to load due to an exception during class initialization, the actual problem is only logged the first time you attempt to load the class.  After the first time, the classloader recognizes that it’s already tried to load the class and just throws a ClassNotFoundException or NoClassDefFoundError.

Symptom
You will see logs for the ClassNotFoundException or NoClassDefFoundError (usually many such logs), but you see that the class is in the classpath.  You won’t see any root cause on any stack trace but the first (and that one is typically not a CNFE or NCDFE).

Finding the root cause
If you get a CNFE or NCDFE and you see the class in the classpath, search back your logs for ${missing classname}.<clinit> in a stack trace to figure out what prevented the class from being loaded.  Remember that the log may have rolled off.

Possible root causes for CNFE/NCDFE

  • Desired class is not in classpath (this is the boring case)
  • Initialization of desired class throws a RuntimeException or an Error
    • Static variable is initialized via a function which threw an uncaught Throwable
      • E.g. public static final String SOME_CONST = SomeClass.getString(“SOME_KEY”); where SomeClass.getString(String key) can throw an Exception.
    • Static block (loose code in {} in the class definition) threw an uncaught Throwable
      • E.g. public class Foo { { doSomeStaticInitialization(); } …
    • variable or method signature includes type which could not be initialized (see this same list of root causes)
      • E.g. public class Foo { SomeClassWithErrorInInitializer attr1; …

From The SDE Tip – Amazon

Popularity: 1% [?]

Project Euler : Frequency Analysis For Message Decryption

Problem 59 of Project Euler brought back memories of Sherlock Holme’s Dancing Men. Holmes had to decrypt a series of messages – the characters of which were stick men figures in varying dancing positions. He computed the frequencies of each dancing figure. Then since ‘e’ is the most used letter in English, he replaced the most frequent dancing figure with ‘e’.

I did the same thing with a difference. The given text in the problem did not have any spaces. This meant that even the spaces had been encrypted. And the most frequent number occurred almost twice as much as the next one. So I replaced that with space. That’s it. The problem was solved.

The password used to encrypt the text is – ‘god’ and the following code decrypts the text and finds the answer in the count variable.


    private List<Distribution<Integer>> parseInputFile(String fileName) throws FileNotFoundException {
        List<Distribution<Integer>> frequencies = Lists.newArrayList(new Distribution<Integer>(), new Distribution<Integer>(), new Distribution<Integer>());

        Scanner scanner = new Scanner(new File(fileName)).useDelimiter(",");
        int count = 0;
        for(int i = 0;  scanner.hasNext(); i++) {
            int num = scanner.nextInt();
            frequencies.get(i % 3).add(num);

            if(i % 3 ==0) num ^= 103;
            else if(i%3 == 1) num ^= 111;
            else if(i%3 == 2) num ^= 100;
            count += num;
            System.out.printf("%c", num);
        }

        System.out.println("Sum = " + count);

        return frequencies;
    }

Popularity: 5% [?]

Project Euler : Expansion Of Continued Fraction For Square Root Of Two

Problem 57 of Project Euler is pretty simple forward. It is very easy to see the following relation:

Numerator = Numerator + 2 * Denominator

Denominator = Numerator + Denominator

Simple code and was done. Here’s that:

    public int getCount(int numberOfExpansions) {
        int count = 0;

        BigInteger numerator = BigInteger.valueOf(3);
        BigInteger denominator = BigInteger.valueOf(2);
        while(--numberOfExpansions > 0) {
            BigInteger a = denominator.multiply(NumberUtil.BIG_TWO).add(numerator);
            BigInteger b = numerator.add(denominator);
            numerator = a;
            denominator = b;
            if(NumberUtil.numberOfDigits(numerator) > NumberUtil.numberOfDigits(denominator)) count++;
        }

        return count;
    }

Popularity: 4% [?]

Project Euler : n Digit Integers That Are Also nth Power

Problem number 63 of Project Euler gives the expression a = b^c and imposes the condition that the number of digits in a is equal to c. An example would be 16807 = 7^5. ab and c are all positive integers. The problem asks one to find the number of such possibilities.

Solving this problem made me happy. So, I’ll go into the details.

Given the problem, the first thing that comes to mind is the special relationship between the log of a number and the number of digits. The hint is to use logarithms to simplify the problem.

a = b^c

\Rightarrow \log(a) = c \log(b)

But \log(a) = c - f, where f \in (0, 1]

\Rightarrow c - f = c \log(b)

\Rightarrow c \log(\frac{10}{b}) = f

\Rightarrow 0 < \log(\frac{10}{b}) \leq \frac{1}{c}, dividing by c \in Z^+

That’s it! Our problem has been solved. I hope you have already noted that b \in [1, 9]. This means that we can precompute the values of \log(\frac{10}{b}) and store the results in an array. Then for all c \geq 1, just count how many of the precomputed values are smaller than or equal to \frac{1}{c}. The iteration completes when the smallest precomputed value is greater than \frac{1}{c}.

Below is the piece of code I wrote in Java to solve the task.

    public int getCount() {
        m_precomputedValues = new double[9];
        for(int i = 1; i < 10; i++) {
            m_precomputedValues[i-1] = Math.log10(10.0 / i);
        }

        int count = 0;

        int numberOfDigits = 1;
        double val = 1.0 / numberOfDigits;
        while (val > m_precomputedValues[8]) {
            for(int idx = 8; idx >= 0 &amp;amp;&amp;amp; m_precomputedValues[idx] <= val; idx--) count++;
            val = 1.0 / ++numberOfDigits;
        }

        return count;
    }

Popularity: 4% [?]

Project Euler : Count Of Lychrel Numbers Below 10000

Problem 55 of Project Euler explains what Lychrel numbers are and asks one to find the number of those below 10000. The only sticking point in the whole procedure is that after a few iterations the number become so large that BigInteger needs to be used. But otherwise, just follow the wording of the statement and the problem will be solved.

Here’s the outermost loop which does the counting:

    private static final int ARRAY_SIZE = 1000000;
    private static final BigInteger BIG_ARRAY_SIZE = BigInteger.valueOf(ARRAY_SIZE);

    public static int countOfLychrelNumbers(int upperLimit) {
        int count = 0;

        State[] lychrelNumbers = ArrayUtils.createArray(ARRAY_SIZE, State.NOT_VERIFIED, new State[0]);
        for (int i = 0; i <= upperLimit; i++) {
            if (lychrelNumbers[i] == State.NON_LYCHREL) continue;
            if (lychrelNumbers[i] == State.LYCHREL || testForLychrelProperty(lychrelNumbers, i)) {
                count++;
            }
        }

        return count;
    }

    private enum State {
        LYCHREL, NON_LYCHREL, NOT_VERIFIED
    }

It’s in the testForLychrelProperty method that all the caching logic comes in. It makes sense to cache the results. However, I have imposed a cache limit as well. Here’s the messier code with caching logic:

    private static boolean testForLychrelProperty(State[] lychrelNumbers, int number) {
        int iterCount = 1;
        State state = State.NOT_VERIFIED;
        BigInteger num = BigInteger.valueOf(number);
        List<BigInteger> found = new ArrayList<BigInteger>();
        found.add(num);
        while (iterCount < 50) {
            BigInteger reverse = NumberUtil.reverse(num);
            BigInteger sum = reverse.add(num);
            if (!num.remainder(BigInteger.TEN).equals(BigInteger.ZERO)) found.add(reverse);
            state = sum.compareTo(BIG_ARRAY_SIZE) < 0 ? lychrelNumbers[sum.intValue()] : State.NOT_VERIFIED;
            if (state == State.LYCHREL || state == State.NON_LYCHREL) {
                break;
            }
            if (PalindromeUtil.checkIfPalindrome(sum.toString())) {
                state = State.NON_LYCHREL;
                break;
            }
            num = sum;
            found.add(sum);
            iterCount++;
        }

        if (state == State.NOT_VERIFIED) state = State.LYCHREL;

        setState(lychrelNumbers, found, state);
        return state == State.LYCHREL;
    }

The setState method simply updates the cache with the latest result. Here is it’s implementation:

    private static void setState(State[] lychrelNumbers, List<BigInteger> numbers, State state) {
        for (BigInteger number : numbers) {
            if (number.compareTo(BIG_ARRAY_SIZE) > 0) return;
            lychrelNumbers[number.intValue()] = state;
        }
    }

All in all, the code runs in some 180ms on my laptop.

Popularity: 6% [?]

Project Euler : Largest Sum Of Digits In Exponentiation

The problem 56 of project euler requires one to sum the digits of the exponentiation a^b, where 1<= a,b <= 100, and find the largest sum.

Well, let’s just say I did it by proper brute force. Here’s my java code:

    public static int largestSumOfDigitsInExponentiationNew(int baseMax, int exponentMax) {
        int largest = 0;

        for (int i = 1; i <= baseMax; i++) {
            BigInteger num = BigInteger.valueOf(i);
            BigInteger prev = num;
            largest = Problem_56.checkLargestSum(prev, largest);
            for (int j = 2; j <= exponentMax; j++) {
                prev = prev.multiply(num);
                largest = Problem_56.checkLargestSum(prev, largest);
            }
        }

        return largest;
    }

    private static int checkLargestSum(BigInteger number, int curentLargest) {
        int sum = NumberUtil.getSumOfDigits(number);
        return curentLargest < sum ? sum : curentLargest;
    }

Popularity: 5% [?]

Project Euler : Finding Minimal Sum Path In A Matrix

The problem 81 of Project Euler gives a square matrix and requires one to find the minimal sum path. Specifically, it only required the sum of the numbers on the path. The question immediately reminded me of problem 67 and I set out solving with the same logic.

The logic has been explained in the following post of mine: Project Euler : Maximum Sum Traversing Top To Bottom In A Triangle.

The following lines of code did the work for me:

    public static int getMinimalSumPath(int[][] numbers) {
        int dimension = numbers.length - 1;
        for (int x = dimension; x >= 0; x--)
            for (int y = dimension; y >= 0; y--) {
                if (x == dimension &amp;amp;&amp;amp; y == dimension) continue;
                int val1 = x < dimension ? numbers[x + 1][y] : Integer.MAX_VALUE;
                int val2 = y < dimension ? numbers[x][y + 1] : Integer.MAX_VALUE;
                numbers[x][y] += val1 < val2 ? val1 : val2;
            }

        return numbers[0][0];
    }

Popularity: 13% [?]

Project Euler : Fractions With Unorthodox Cancelling Method

Problem 33 in Project Euler says there are 4 non-trivial curious fractions which satisfy the following properties:

\frac{ax}{xb} = \frac{a}{b}

where a, b and x are all single digit numbers and a < b

The solution was pretty simple.

\frac{ax}{xb} = \frac{a}{b}

\Rightarrow b [10a + x] = a [10x + b]

\Rightarrow x = \frac{9ab}{10a - b}

Similarly solving for the other form \frac{xa}{bx} = \frac{a}{b} gives the answer x = \frac{9ab}{10b - a}.

Implementing this solution in java:

public class Problem_33 {
	/**
	 * @return All the proper non-trivial curious fractions having numerator, denominator < 100
	 */
	public static List<Fraction> getCuriousFractions() {
		List<Fraction> curiousFractions = new ArrayList<Fraction>();

		for (int a = 1; a < 10; a++)
			for (int b = a + 1; b < 10; b++)
				if (Problem_33.checkCondition(a, b))
					curiousFractions.add(new Fraction(a, b));

		return curiousFractions;
	}

	/**
	 * Checks if the number is of the form xa/bx = a/b or ax/xb = a/b
	 *
	 * @param a The numerator of the fraction
	 * @param b The denominator of the fraction
	 * @return true if it can be formed into a curious fraction, false otherwise
	 */
	private static boolean checkCondition(int a, int b) {
		int numerator = 9 * a * b;
		int denominatorA = 10 * b - a;
		int denominatorB = 10 * a - b;
		return numerator % denominatorA == 0 && numerator / denominatorA < 10 || numerator % denominatorB == 0
		        && numerator / denominatorB < 10 ? true : false;
	}
}

Popularity: 9% [?]

Project Euler : Find Hexagonal Pentagonal Number

Problem 45 of Project Euler required one to find numbers which are triangular, pentagonal and hexagonal together. Checking for triangularity is redundancy as all hexagonal numbers are also triangular. So it boils down to solving the equation:

n(3n – 1) / 2 = m(2m – 1)

Using the Diophantine Equation solver at http://www.alpertron.com.ar/QUAD.HTM, we get the solution as

n_{i+1} = 97 \times n_i + 112 \times m_i - 44

m_{i+1} = 84 \times n_i + 97 \times m_i - 38

The solution comes very easily from here.

public class Problem_45 {
	/**
	 * Find a number which is Pentagonal and also Hexagonal
	 *
	 * @param index The nth number which is required
	 * @return The nth number which is both Pentagonal and Hexagonal
	 */
	public static int findPentaHexaNumber(int index) {
		int p = 1;
		int h = 1;
		do {
			// This is the solution to the Diophantine Equation
			// p(3p-1)/2 = h(2h-1)
			int p2 = 97 * p + 112 * h - 44;
			h = 84 * p + 97 * h - 38;
			p = p2;
			index--;
		} while (index != 0);

		return h * (2 * h - 1);
	}
}

Popularity: 8% [?]