Find largest 1M numbers in 1B numbers

Problem Describe an algorithm to find the largest 1 million numbers in 1 billion numbers. Assume that the computer memory can hold all one billion numbers. Solutions Method 1 - Sorting We can certainly sort those numbers and return the last 1 million of them. That takes O(n log n). Method 2 - Sort partially using some pivot If we think about it, we actually do not need to sort them. [Read More]

Shortest distances between two words in a file

Problem You have a large text file containing words. Given any two words, find the shortest distance (in terms of number of words) between them in the file. Can you make the searching operation in O(1) time? What about the space complexity for your solution? OR Find the minimum distance between 2 numbers in the array? Example File contains: as was is the as the yahoo you me was the and [Read More]

Count the number of 2s between 0 and n

Problem Write a method to count the number of 2s between 0 and n. Example Input Range : between 0-20, Output : 3 (i.e 2, 12 , 20)) Solution We need recursion. The number n can be broken down into a couple of ranges. The number of 2’s of numbers in each range is independent of each other. For example, 3345 can be broken into [0, 3000] and (3000, 3345]. [Read More]

Randomly generate m integers from an array of size n

Problem Write a method to randomly generate a set of m integers from an array of size n Each element must have equal probability of being chosen. Solution This question depends on 2 things: how random number generation works? Fisher yates algorithm or Knuth Shuffle? Now that we have fisher yates algorithm in our quiver, we can solve this problem : let m be the number of elements to select for i = 1; i <= m; i++ pick a random number from 1 to n, call it j swap array\[j\] and array \[n\] (assuming 1 indexed arrays) n-- public static int\[\] chooseNFromM(int\[\] array, int m) { if (m > array. [Read More]

Find pairs of integers in an array that sum to a value

Problem Design an algorithm to find all pairs of integers within an array which sum to a specified value. Solution  We have found a pair of elements in an array which sum upto a given number already, this is a known problem called 2-sum problem. We know we can solve this problem in O(n) time. Let T be the sum. Method 1 - Using Hash table solution [Read More]

Implement rand7() using rand5()

Problem Write a method to generate a random number between 1 and 7, given a method that generates a random number between 1 and 5. (i.e., implement rand7() using rand5()). Solution This appear to be one of those probabilistic analysis questions. You should be familiar with the concept of expected value, as it could be extremely helpful in probabilistic analysis. This solution is based upon Rejection Sampling. The main idea is when you generate a number in the desired range, output that number immediately. [Read More]

Algorithm Analysis

In computer science, the analysis of algorithms is the determination of the amount of resources (such as time and storage) necessary to execute them. Most algorithms are designed to work with inputs of arbitrary length. Usually, the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity). In computer science there can be multiple algorithms exist for solving the same problem (for example, sorting problem has many algorithms like insertion sort, selection sort, quick sort and many more). [Read More]

Variables and Data Types Introduction

In computer science and computer programming, a data type or simply type is a classification identifying one of various types of data, such as Real, Integer or Boolean, that determines the possible values for that type; the operations that can be done on values of that type; the meaning of the data; and the way values of that type can be stored. Variables: Variables are placeholders used to store values; they have names and data types. [Read More]

Encode XML

Problem Since XML is very verbose, you are given a way of encoding it where each tag gets mapped to a pre-defined integer value The language/grammar is as follows: Element –> Element Attr* END Element END [aka, encode the element tag, then its attributes, then tack on an END character, then encode its children, then another end tag] Attr –> Tag Value [assume all values are strings] END –> 01 [Read More]

Find the frequency of a word in a book

Problem Design a method to find the frequency of occurrences of any given word in a book. Solution Just scan the word, trim it and put in hashtable. Now, when searching lower case it and search it. Hashtable<String, Integer> setupDictionary(String\[\] book) { Hashtable<String, Integer> table = new Hashtable<String, Integer>(); for (String word : book) { word = word.toLowerCase(); if (word.trim() != "") { if (!table.containsKey(word)) table.put(word, 0); table.put(word, table. [Read More]