Thursday, December 8, 2011

Python File Extensions

Now, I am going to work on a new project with Python! Although I know quite a little about Python 2.x, but Python 3 will be the trend and future, and I am going to use it this time. Hopefully will pick it up very soon!

As blog notes, I got some information about Python related file extensions. Please see this site for details about file extensions.

File ExtensionFile TypeFile Description
.pyDeveloper
  • Program file or script written in Python
  • Can be created and edited with a text editor, but requires a Python interpreter to run
  • Often used for programming Web servers and other administrative computer systems
.pycExecutable
  • Python complied file
.pydDeveloper
  • Python dynamic module
  • Can be imported into Python programs
.pymDeveloper
  • PYM macro preprocessor file
  • Used by PYM, a macro preprocessor based on the Python programming language
  • Contains macros defined between "#begin python" and "#end python" markers; used to create shortcuts for programmers
  • PYM files store macro shortcuts that are expanded to their full text replacements during the Python interpreter's preprocessing stage
.pyoExecutable
  • Python optimized code
  • Similar to a .pyc file, but assert statements and SET_LINENO instructions are removed
.pywDeveloper
  • Python GUI source file
  • Displays a graphical user interface (GUI)
  • May contain standard Python code as well as graphical Python code
  • Must be run using pythonw instead of python in order to prevent a DOS console/terminal from popping up to display the output

References:
FileInfo.com
File Extension PYW

Tuesday, December 6, 2011

Loop & Its Entrance in Singly Linked List

Problem:
Given a singly linked list, check whether there is a loop in it. If yes, find the entrance to the loop.

Thinking:
Use 2 pointers, one fast and one slow. If at some time, these 2 pointers meet, then there is a loop; if fast pointer arrives at the end of the list, there is no loop.

Suppose we have above list structure. Assume loop length is \(r(r\in N)\), and the length before the loop (from \(A\) to \(B\)) is \(l\). Always, one can represent \(l\) as \(l=k\cdot r+p\), where \(k=0,1,2,...\) and \(p=0,1,2,...,r-1\).

Then, suppose the moving speed of "fast" is \(a\) times of "slow", where \(a>1\) and \(a\in N\). If "slow" moves one step a time, "fast" will move \(a\) steps a time.

If there is a loop, where will the 2 pointers meet? Suppose they will meet after "fast" finishes \(n\) loops and "slow" finishes \(m\) loops. And suppose also, at this moment, they are at point C, which is of \(q\) steps after the loop entrance \(B\). Here, \(n\ge1, m\ge0, m,n\in N, q=0,1,...,r\). Hence, the following equation holds: \[k\cdot r+p+n\cdot r+q=a(k\cdot r+p+m\cdot r+q)\] We need find values for \(m,n\) and \(q\) so that above equation holds. The equation can be rewritten as \[n\cdot r-a\cdot m\cdot r-k(a-1)r=(a-1)(p+q)\] We can always find a \(q\) such that \(p+q=r\), then \[n-am-k(a-1)=a-1\] \[n=am+(k+1)(a-1)\] For the first time they meet, \(m=0\) and then \[n=(k+1)(a-1)\] To be most efficient, \(a=2\) and \(n=k+1\).

Conclusion:
From above statement, we know that "fast" and "slow" will definitely meet during the first loop round for "slow". Also, at the moment they meet, they are at \(q=r-p\) steps after \(B\), which is equivalent to \(p\) steps before \(B\). Note that the head length \(l=k\cdot r+p\), which means that if two pointers start at \(A\) and \(C\) at the same time and both move one step a time, they will meet at the loop entrance \(B\) for the first time.

Below, I just show the code to find the loop entrance. Actually, the loop check is just part of and can be told from the return value of this program.

Are 2 Strings Anagrams or Not?

Problem:
Given 2 strings, determine if they are anagrams. Anagram means the 2 strings contain exactly the same characters and number of appearances. In other words, we can do a permutation to get the other string from one string.

Solution 1:
Sort the 2 strings first, then if they are anagrams, the sorted strings must be identical. Otherwise, they are not.

In C++, we may use the library function sort in the header < algorithm >. This function is comparison based and the average time complexity is \(O(n\log n)\).


Solution 2:
Since the character values are always in a fixed range, and in this case, counting sort is good approach. In this way, we may obtain \(O(n)\) time complexity and \(O(n)\) space complexity.


Solution 3:
Similar to the idea of counting sort, we don't actually sort the strings. All we need is the counts of characters appearing in the strings. If the counts for each character are equal, then the 2 strings are anagram. This solution also exhibit \(O(n)\) time complexity, but \(O(1)\) space complexity.


Solution 4:
Even more, we can just count the characters for one string, and make judgement while scanning the other. This approach also has \(O(n)\) time complexity and \(O(1)\) space complexity.
There is a bug in Solution 4. What if s1 is longer and contains all characters in s2? For example, s1="abcd", s2="ab".

Two possible solutions for this:
  1. Add a length check at the beginning. Two strings must be of the same length to be anagram.
  2. Add another round of check on the count array after scanning the second string. All entries should be 0.

Tuesday, November 29, 2011

Assignment between Class Objects

Some short notes about assignment between class objects.

Suppose we are considering the assignment a = b, where a if of class A type, and b is of class B type.
  1. The overloaded operator= function will be looked first. If there is any, the assignment will call this function.

  2. If no overloaded operator= function, and a and b are of same class type, the assignment will be member-wise shallow copy.

  3. If no overloaded operator= function, and class B is derived from class A (in other words, b is an a), the assignment will copy the base class members.

  4. If no overloaded operator= function, and a and b are of unrelated class types, the assignment will try to call copy constructor if any.

  5. If no overloaded operator= function, a and b are of unrelated class types, and no copy constructor available, the assignment will try to call overloaded type casting function if any.

  6. If none of above applies, then the compiler will report an error about the assignment.

Related Posts:
Type Casting in C++
Type Casting on Classes
static_cast, dynamic_cast, and reinterpret_cast

Constructors and Destructors

This post looks into some details about class constructors and destructors. First look at a piece of code in the following.


What's going on really with above code? Which objects were constructed and destructed? What's the order and logic relationship? Below is the output by above program. (The comments were added for explanation.)
default constructor: 1st           // initialize f1 with value "1st"
before calling func ...
copy constructor: 1st              // construct local object f in func
within function func ...
copy constructor: 1st              // initialize f2, no temporary object returned
destructor: 1st                    // destroy local object f
after calling func ...
default constructor: 3rd           // initialize f3 with value "3rd"
default constructor: default       // initialize f4 with default value
default constructor: 4th           // temporary object with value "4th"
f4.buf: 4th                        // after shallow copy to f4
destroying temporary object ...
destructor: 4th                    // destroy temporary object
f4.buf:                            // f4.buf becomes invalid
destroying local objects ...
destructor:                        // destroy f4
destructor: 3rd                    // destroy f3
destructor: 1st                    // destroy f2
destructor: 1st                    // destroy f1

If no copy constructor was defined, the function call may bring some problems because of shallow copy. When exiting the function func, the destroy of local variable f will also make the buf member of returned object invalid.

Accessibility of Class Members

In this post, I briefly summarize the accessibility of class members in C++.

Accesspublicprotectedprivate
the same classyesyesyes
derived class
public inheritance
yes
still public
yes
still protected
no
still private
derived class
protected inheritance
yes
change to protected
yes
still protected
no
still private
derived class
private inheritance
yes
change to private
yes
change to private
no
still private
friendyesyesyes
otheryesnono

References:
C++ Tutorial

Related Posts:
Friend Functions and Classes

References in C++

Some notes about references in C++. Also, see the new standard introduced in C++11 about rvalue references and move semantics.
  1. It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.
  2. Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.
  3. References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.
  4. References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor.
  5. Most compilers will support a null reference without much complaint, crashing only if you try to use the reference in some way.
  6. References become invalid if they refer to an object with automatic allocation which goes out of scope.
  7. References become invalid if they refer to an object inside a block of dynamic memory which has been freed.
  8. Undefined behavior and invalid reference (initialized by dereferencing a null pointer):
    int *ip = 0;
    int &ir = *ip;
  9. References exhibit polymorphic capabilities, which is similar to pointers.
  10. References not qualified with const can only be bound to addressable values, not rvalues.
  11. References not qualified with const cannot be bound to temporary objects.
  12. Never return references to local or temporary objects.
  13. It is unspecified whether or not a reference requires storage.
  14. There shall be no references to references, no arrays of references, and no pointers to references.
  15. The declaration of a reference shall contain an initializer except when the declaration contains an explicit extern specifier, is a class member declaration within a class declaration, or is the declaration of a parameter or a return type.
  16. A reference cannot be bound directly to a bitfield.
  17. Given typedef int & iRef;, the declaration const iRef ir=5; is incorrect. Here, the qualifier const will be ignored. However, the following code is correct.
    typedef const int & ciRef;
    ciRef ir = 5;
  18. For pointers, there may be both const pointers and pointers to const variables; for references, there is no const references, at least not directly. By nature, reference is const and cannot be reassigned. In this sense, the term "const reference" always refers to "reference to const variable".

References:
Wikipedia: Reference
Rvalue References and Move Semantics

Monday, November 28, 2011

Function Template Specialization

A good study example for function template specialization.


Some notes about function template specialization:
  1. Explicit template specialization should be defined in namespace scope, and hence not inline within class body.
  2. Class templates can be partially specialized, but partial function template specialization is not allowed.
  3. The template parameter list after function name may be omitted, or some trailing arguments omitted. However, it is recommended to keep the <> pair.
  4. Template parameter list is part of signature for function templates and their specializations. Pay attention to the differences between function overloading and partial specialization.
    // Suppose we have the function template
    template< typename T1, typename T2 >
    void func( T1 v1, T2 v2 ) { ... }
    
    // The following is considered as function template overloading and allowed
    template< typename T >
    void func( T v1, T v2 ) { ... }
    
    // The following is considered as partial specialization and not allowed
    template< typename T >
    void func< T, T >( T v1, T v2 ) { ... }
    
Related Posts:
Function Signatures
Class Templates

Inheritance and Order of Constructor Calls

For the code segment below, how many times of base class constructor will be called?

  1. For single inheritance, the order is always the same: base class constructor first, then its own constructor.
  2. For multiple inheritance, call the virtual base constructor first if any, then follow the order from left to right for non-virtual base constructors, and finally call its own constructor. Note that only a unique set of virtual base constructors will be called, which is exactly the purpose of virtual inheritance.

Example 1:
Output 1:
Base constructor
Foo1 constructor
Base constructor
Foo2 constructor
Multi constructor

Example 2:
Output 2:
Base constructor
Foo1 constructor
Base constructor
Foo2 constructor
Multi constructor

Example 3:
Output 3:
Base constructor
Base constructor
Foo1 constructor
Foo2 constructor
Multi constructor

Example 4:
Output 3:
Base constructor
Foo1 constructor
Foo2 constructor
Multi constructor

Sunday, November 27, 2011

static_cast, dynamic_cast and reinterpret_cast

See also the posts (1, 2) on type casting.
  1. static_cast can be used to cast between fundamental types.
    int i;
    float f;
    i = static_cast<int>(f);
    
  2. static_cast cannot be used to cast between pointers or references to fundamental types.
    int *ip, &ir;
    float f;
    ip = static_cast<int*>(&f);   // illegal
    ir = static_cast<int&>(f);    // illegal
    
  3. static_cast can be used to cast between class types if one of the class types has explicit conversion constructor or type casting operator function. For this case, static_cast is the same as explicit casting using parentheses or implicit casting using assignment.
  4. static_cast can be used to cast derived class type to base class type (public inheritance only), but cannot be used to cast base class type to derived class type if no defined conversion constructor or casting operator. Think about function arguments and exception catch, which are implicit type casting.
  5. static_cast can be used to cast between pointers or references to related classes (base and derived, public inheritance only). The compiler won't complain if casting base class pointer to derived class pointer. The overhead of type-safe check by dynamic_cast can be avoided. It's the programmers' responsibility to ensure safe conversions.
  6. static_cast cannot be used to cast for other cases, including between fundamental types and class types, between pointers or references to fundamental types and class types, or between pointers or references to unrelated classes.
  7. dynamic_cast can be used to cast pointers or references to related classes only (base and derived, public inheritance only). In other words, dynamic_cast won't work on non-pointer/reference types, or pointer/reference to fundamental types.
  8. When casting between pointers or references to unrelated classes, dynamic_cast will pass compile, but there will be run-time error.
  9. dynamic_cast can be always successful if casting derived class to base class (public inheritance only).
  10. When casting from base class to derived class, dynamic_cast will pass compile only when the base class is polymorphic. But there will be run-time error.
  11. dynamic_cast cannot be used to cast base class to derived class if the base class is not polymorphic. (compile error)
  12. reinterpret_cast can be used to cast between any pointer or reference types, even between those to unrelated classes or fundamental types or mix.
    int *ip, &ir;
    float f;
    ip = reinterpret_cast<int*>(&f);
    ir = reinterpret_cast<int&>(f);
    
  13. When using reinterpret_cast, neither the content pointed nor the pointer type itself is checked, and thus dereferencing it is unsafe.
  14. reinterpret_cast even works on any kind of inheritance, either public, protected, or private.
  15. reinterpret_cast can be used to cast between pointer types and integer types.

To demonstrate, I wrote a sample code.


Related Posts:
Type Casting in C++
Type Casting on Classes
Assignment between Class Objects

Thursday, November 24, 2011

Object Initialization for User-Defined Classes

First of all, to declare or initialize class objects, the class must have default constructors, either explicitly defined in source code or implicitly defined by the compiler. Note that a default constructor is a constructor that either has no parameters, or if it has parameters, all the parameters have default values.

When declaring objects (not references or pointers), we can either directly declare an object with no explicit initializer or parenthesis, or explicitly provide initializer with parenthesis. Note that although the default constructor was defined and we may call general default functions without providing any arguments, we cannot initialize objects by such function call scheme. In other words, we may not provide a pair of parenthesis while no explicit initializer arguments.

Suppose class SomeClass has default constructor, then
SomeClass object1;                 // legal
SomeClass object2(initializer);    // legal
SomeClass object3();        // illegal (compiler won't complain, 
                            // but it will consider object3 as a function)

When initialize object pointers using keyword new, there will be different behaviors for explicitly and implicitly defined default constructors.

For classes with explicitly defined default constructor, the following 2 schemes work exactly the same way.
SomeClass *pointer1 = new SomeClass;       // call default constructor
SomeClass *pointer2 = new SomeClass();     // call default constructor

However, for classes without explicitly defined default constructor, the 2 schemes differ in the initial value for object data members.
AnotherClass *pointer1 = new AnotherClass;       // undefined data members
AnotherClass *pointer2 = new AnotherClass();     // nullified data members

I wrote a sample code and tested with both g++ and MSVC, and I am attaching the output by MSVC in this post after the code.




Related Posts:
Array Declaration and Initialization
new and delete

Type Casting on Classes

In one previous post, I discussed about general type casting in C++. While in this post, I am going to talk about the casting on classes. More specifically, I won't discuss the keyword approach (static_cast, dynamic_cast, reinterpret_cast or const_cast) in this post.

Based on the is-a relationship between derived and base classes, implicit casting from derived class to base class is always possible for public inheritance (not possible for protected or private inheritance).
class Base {};
class Derived: public Base {};

Derived d;
Base b = d;

Without use of keywords, we may have 3 ways to convert unrelated class types.
  1. Corresponding constructor
  2. Overload assignment operator
  3. Overload type casting operator
Suppose we want cast an object of class A to class B, then we may need define 1 or 2 in B's body, while define 3 in A's body.

Here is a sample code I wrote for testing. Note that the "=" sign in declaration is not assignment operator. Initialization, assignment, and explicit casting are all legal for Approaches 1 and 3, but only implicit casting by assignment is legal for Approach 2 with overloaded operator "=".



Related Posts:
Type Casting in C++
static_cast, dynamic_cast, and reinterpret_cast
Assignment between Class Objects

Wednesday, November 23, 2011

getline and istream::getline

In C++, these 2 functions both read a string of characters from an input stream. I didn't pay attention to their differences, and in this post I summarize about that.

First, let's look at the function signatures.

istream& getline( istream& is, string &str );
istream& getline( istream& is, string &str, char delim );

istream& istream::getline( char *s, streamsize n );
istream& istream::getline( char *s, streamsize n, char delim );

For both functions, the delimiter character is delim and '\n' (newline character) is the default value. The extraction also stops if the end of file is reached or if some other error occurs during the input operation. For the function istream::getline, at most \(n-1\) characters can be read and the ending null character will be automatically appended to the C-style string after data extraction.

For both functions, if the delimiter is found, it is extracted and discarded, i.e. it is not stored and the next input operation will begin after it.

Differences:
  1. getline is a global function included in the header <string>, while istream::getline is a member function of class istream.
  2. getline reads characters into a C++ string type string, while istream::getline reads characters into a C-style string.

References:
C++ Reference: getline
C++ Reference: istream::getline

Wednesday, November 16, 2011

Maximize Payoff by Greedy Reordering

Source: CLRS, Introduction to Algorithms, 2nd Edition, Page 384, Problem 16.2-7

Problem:
Suppose you are given two sets \(A\) and \(B\), each containing \(n\) positive integers. You can choose to reorder each set however you like. After reordering, let \(a_i\) be the \(i\)th element of set \(A\), and let \(b_i\) be the \(i\)th element of set \(B\). You then receive a payoff of \(\prod_{i=1}^{n}{a_i^{b_i}}\). Given an algorithm that will maximize your payoff. Prove that your algorithm maximizes the payoff, and state its running time.

Solution:
Greedy Algorithm: Let the set \(\{a_i\}\) be sorted so that \(a_1 \ge a_2 \ge \ldots \ge a_n\) and set \(\{b_i\}\) be sorted so that \(b_1 \ge b_2 \ge \ldots \ge b_n\). Pair \(a_i\) with \(b_i\). This is the required solution.

Complexity:
If the two sets \(A\) and \(B\) are already sorted, the time complexity is \(O(n)\).
If the sets are not sorted, then sort them first and the time complexity is \(O(n\log n)\).

Proof:
Suppose the optimal payoff is not produced from the above solution. Let \(S\) be the optimal solution, in which \(a_1\) is paired with \(b_p\) and \(a_q\) is paired with \(b_1\). Note that \(a_1 \gt a_q\) and \(b_1 \gt b_p\).

Consider another solution \(S'\) in which \(a_1\) is paired with \(b_1\), \(a_q\) is paired with \(b_p\), and all other pairs are the same as \(S\). Then \[\frac{\text{Payoff}(S)}{\text{Payoff}(S')}=\frac{\prod_{S}\hspace{2 mm} a_i^{b_i}}{\prod_{S'}\hspace{2 mm}a_i^{b_i}}=\frac{(a_1)^{b_p} (a_q)^{b_1}}{(a_1)^{b_1} (a_q)^{b_p}}=(\frac{a_1}{a_q})^{b_p - b_1}\] Since \(a_1 \gt a_q\) and \(b_1 \gt b_p\), then \(\text{Payoff}(S) / \text{Payoff}(S') \lt 1\). This contradicts the assumption that \(S\) is the optimal solution. Therefore \(a_1\) should be paired with \(b_1\). Repeating the argument for remaining elements, we can get the result.

Tuesday, November 15, 2011

0-1 Knapsack Problem

Problem:
A thief robbing a store finds \(n\) items; the \(i\)th item is worth \(v_i\) dollars and weights \(w_i\) pounds, where \(v_i\) and \(w_i\) are integers. He wants to take as valuable a load as possible, but he can carry at most \(W\) pounds in his knapsack for some integer \(W\). Which items should he take?

Analysis:
There are two versions of this problem: fractional knapsack problem and 0-1 knapsack problem.
  1. Fractional knapsack problem: the items can be broken into smaller pieces, and the thief can take fractions of items.
  2. 0-1 knapsack problem: the items may not be broken into smaller pieces, and the thief must decide whether to take an item or leave it (binary choices).

For fractional knapsack problem, greedy algorithm exists and the problem can be solved in \(O(n\log n)\) time based on sorting. And a better approach may take \(O(n)\) time for the worst case based on linear-time SELECT algorithm.

For 0-1 knapsack problem, no greedy algorithm exists and we need turn to dynamic programming approach. The algorithm will take \(O(nW)\) time.

Solution:
Define \(c[i,w]\) to be the optimal value for items \(1,2,...,i\) and maximum weight \(w\). There are following recursive relations:
\[c[i,w]=\begin{cases}0, & i=0\text{ or }w=0\\c[i-1,w], & w_i\gt w\\ \max\{v_i+c[i-1,w-w_i], c[i-1,w]\}, & i>0 \text{ and }w_i \le w\end{cases}\] We can have the pseudocode:
DP_01_Knapsack(value, weight, W)
   assert value.length = weight.length
   n ← value.length
   for w ← 0 to W:
      c[0,w] ← 0
   for i ← 1 to n:
      c[i,0] ← 0
      for w ← 1 to W:
         c[i,w] ← c[i-1,w]
         if weight[i] ≤ w and value[i]+c[i-1,w-weight[i]] > c[i-1,w]:
            c[i-1,w] ← value[i]+c[i-1,w-weight[i]]
The set of items to take can be deduced from the table, starting at \(c[n,W]\) and tracing backwards where the optimal values came from. If \(c[i,w]=c[i-1,w]\), then item \(i\) is not part of the solution, and we are continue tracing with \(c[i-1,w]\). Otherwise item \(i\) is part of the solution, and we continue tracing with \(c[i-1,w-w_i]\).

References:
Dynamic Programming Solution to the 0-1 Knapsack Problem

Monday, November 14, 2011

A More Efficient Algorithm for Longest Increasing Subsequence

Problem:
Given an array of numbers or a string of length \(n\), find a longest increasing subsequence.
In my previous post, I discussed the dynamic programming approach with time complexity \(O(n^2)\). For this specific problem, we can have a more efficient algorithm which is also based on dynamic programming. Please see Wikipedia for reference.

The algorithm uses only arrays and binary searching. It processes the sequence elements in order, maintaining the longest increasing subsequence found so far. Denote the sequence values as \(X[1]\), \(X[2]\), etc. Then, after processing \(X[i]\), the algorithm will have stored values in two arrays:

\(M[j]\) — stores the position \(k\) of the smallest value \(X[k]\) such that there is an increasing subsequence of length \(j\) ending at \(X[k]\) on the range \(k ≤ i\) (note we have \(j ≤ k ≤ i\) here).

\(P[k]\) — stores the position of the predecessor of \(X[k]\) in the longest increasing subsequence ending at \(X[k]\).

In addition the algorithm stores a variable \(L\) representing the length of the longest increasing subsequence found so far.

Note that, at any point in the algorithm, the sequence \[X[M[1]], X[M[2]], ..., X[M[L]]\] is nondecreasing. For, if there is an increasing subsequence of length \(i\) ending at \(X[M[i]]\), then there is also a subsequence of length \(i-1\) ending at a smaller value: namely the one ending at \(X[P[M[i]]]\). Thus, we may do binary searches in this sequence in logarithmic time.

The algorithm, then, proceeds as follows.
L = 0
for i = 1, 2, ... n:
   binary search for the largest positive j ≤ L 
     such that X[M[j]] < X[i] (or set j = 0 if no such value exists)
   P[i] = M[j]
   if j == L or X[i] < X[M[j+1]]:
      M[j+1] = i
      L = max(L, j+1)
The result of this is the length of the longest sequence in \(L\). The actual longest sequence can be found by backtracking through the \(P\) array: the last item of the longest sequence is in \(X[M[L]]\), the second-to-last item is in \(X[P[M[L]]]\), etc. Thus, the sequence has the form \[..., X[P[P[M[L]]]], X[P[M[L]]], X[M[L]].\]
Because the algorithm performs a single binary search per sequence element, its total time can be expressed using Big \(O\) notation as \(O(n\log n)\).


References:
Wikipedia item: longest increasing subsequence

Related Posts:
\(O(n^2)\) Algorithm for Longest Increasing Subsequence

Sunday, November 13, 2011

Longest Increasing Subsequence

Problem:
Given an array of numbers or a string of length \(n\), find a longest increasing subsequence.

Brute-force approach:
Enumerate all possible subsequences and then determine which are increasing and store the longest one. This approach is obviously impractical for large \(n\) and its time complexity is exponential.

Dynamic programming approach:
The longest increasing subsequence of an array \(A\) of length \(n\) can be determined based on that of all the prefix arrays. We need use an auxiliary array \(T[n]\), in which \(T[i]\) stores the length of longest increasing (non-decreasing) subsequence ending at \(A[i]\). Notice that is ending at, i.e., the last element must be \(A[i]\). Meanwhile, the global maximum length is also stored in variable \(L\). The time complexity for this algorithm will be \(O(n^2)\).



Related Posts:
\(O(n\log n)\) Algorithm for Longest Increasing Subsequence

Wednesday, November 9, 2011

Implementation of prev_permutation

In my previous posts, I discussed about various approaches to find the next number in order. The main idea for that is the function of next_permutation in C++, or equivalent implementations in C or Java. Now, I am trying to implement the opposite function prev_permutation.

The function takes a string as input, update the string to the previous one in lexicographical order if there is, and return a bool value. If there is a smaller string, return true; if not, return false and update the string from the smallest to the greatest.

Idea:
  1. Search back from the end, find the first greater character compared to the one immediately after it.
  2. Search back from the end again, find the first smaller one compared to the great character just found.
  3. Swap the 2 characters found from above 2 steps.
  4. Reverse the rest string after the character location found in Step 1.

Tuesday, November 8, 2011

Variable Length Argument List

Sometimes we need pass along an unknown number of arguments before actually calling the function, such as searching for the minimum of maximum of a list of numbers, or calculating their average. Most programming language are able to first construct an array or list and then pass the constructed object as a single argument. But we can also pass an undefined length of arguments, and this post just summarizes this approach. Different languages provide different ways to deal with this case, and here I am discussing C/C++, Java and Python.

In C/C++, we need the type va_list and macros va_start, va_arg, and va_end. All these are included in the header file <stdarg.h> or <cstdarg>. Below is an example.

In Java, however, we directly append an ellipse after the type name, and then we can give a variable name (which is actually is an array and can be iterated via enhanced for loop in Java).

In Python, programmers have more power than passing an array of arguments: it allows a dictionary of arguments, by which we actually passing both a set of identifier names and their corresponding values. This offers more options to use the variable length arguments.

In summary,
  1. The variable length parameters are always placed at the end of parameter list of the function.
  2. In C/C++, at least one parameter must be passed and declared in the list. There is no such requirement in Java or Python. Also, in C/C++, the comma or space is not required after the last given parameter.
  3. In C/C++, the variable arguments are able to be converted to different types via va_arg, and also this conversion is required for each argument. However, Java will first construct an array of the same type for all the passed arguments. Of course, you can cast the elements if necessary.
  4. In Python, the arguments in calling function may become a little bit complex. The general order is that non-keyword arguments should be placed before keyword arguments. Also, there may be explicitly defined or predefined tuple or dict that will be passed as variable arguments, and in general these are always placed after "in-place" arguments. Here is an example.
    def: foo(arg1, arg2, *nonkws, **kws):
        #### function body
    
    pre_Tuple = (pre_nonkw1, pre_nonkw2)
    pre_Dict = {'pre_kw1':pre_v1, 'pre_kw2':pre_v2}
    
    # call function foo()
    foo(val1, val2, nkw1, nkw2, kw1=v1, kw2=v2, *(expl_nkw1, expl_nkw2), **{'expl_kw1':expl_v1, 'expl_kw2':expl_v2})
    or
    foo(val1, val2, nkw1, nkw2, kw1=v1, kw2=v2, *pre_Tuple, **pre_Dict)

    The only possibility to reorder the argument list is
    foo(val1, val2, nkw1, nkw2, *(expl_nkw1, expl_nkw2), kw1=v1, kw2=v2, **{'expl_kw1':expl_v1, 'expl_kw2':expl_v2})
    or
    foo(val1, val2, nkw1, nkw2, *pre_Tuple, kw1=v1, kw2=v2, **pre_Dict)

    Explicitly defined tuple or dict cannot co-exist with predefined tuple or dict. Only one for each is possible.

Monday, November 7, 2011

Class and Struct

In C, there is no such special thing called class, and class is just an ordinary identifier. For example,
int class=2012;
The struct in C contains data members only, and all of them are visible in the same scope as struct itself.

In Java, there is no such thing called struct, but we can use class to implement struct. The class may contain member functions besides data fields, and programmer is able to declare the visibility for each member.

Interesting while kind of confusing, C++ has them both: class and struct. In many ways, C++ compilers treat them the same. Below are some short notes about class and struct in C++.

  1. Both class and struct can have member data and member functions, and are able to specify their visibilities.
  2. Both class and struct can inherit from and be inherited by another class or struct. To state more clearly, class can inherit from either class or a struct, and similarly, struct can inherit from either class or struct.
  3. Both class and struct may contain pointers to the same class/struct type, but can't contain objects or references of the same class/struct type.
  4. Both keywords class and struct can be omitted when declaring objects. While in C, the keywords struct, as well as union and enum, can not be omitted when declaring unless a typedef was used or anonymous variable is being declared.
  5. C++ compilers don't distinguish the keywords class and struct in terms of type name. In other words, you may not define a class with name A and then define a struct with name A. The compilers will report "redefinition error". This is also true for struct, union and enum in C, although the keywords must be included for declaration.
  6. The members of a class are private by default, and the members of a struct are public by default.
  7. The inheritance for a class is also private by default, and the inheritance for a struct is public by default.
  8. In C and C++, all definitions for class, struct, union and enum must end with a semicolon ;, while in Java, no such ; outside the closing brace for class and enum definitions. (There is no struct or union in Java.)

Sunday, November 6, 2011

Dynamic Memory Allocation with new and delete

Some notes about the usage of keywords new and delete.

Syntax:
type *p = new type;     // undefined initial value for user-defined type
type *p = new type();   // default initial value, 0 for user-defined type
type *p = new type(initializer);
type *p = new type[size];          // default constructor
type *p = new type[size]();        // also default constructor
type *p = new (nothrow) type;      // don't throw bad_alloc exception
type *p = new (nothrow) type();
type *p = new (nothrow) type(initializer);
type *p = new (nothrow) type[size];
type *p = new (nothrow) type[size]();
delete p;
delete[] p;
Notes:
  1. Initializers cannot be specified for arrays created with new. All elements of an array will be initialized with the default constructors. If the type doesn't have a default constructor, this will be a compile-error.
  2. The behavior when operator new fails is compiler-specific. Most compilers will throw a std::bad_alloc exception. Using nothrow as above will return 0 or NULL and continue the program when new fails. Another option is to use function set_new_handler to handle new failures. The header file <new> is needed for the use of bad_alloc, nothrow, and set_new_hanlder.
  3. The statement delete NULL; is allowed and no error will be reported.
  4. In order to avoid double-free problem, assign the pointer to NULL after deletion.
    delete p;
    p = NULL;
    or
    delete[] p;
    p = NULL;
  5. It is not possible to directly reallocate memory allocated with new[ ]. To extend or reduce the size of a block, one must allocate a new block of adequate size, copy over the old memory, and delete the old block.
  6. Arrays allocated with new[ ] must be deallocated with delete[ ], not delete. Since the layout of arrays allocated with new[ ] is implementation defined, and possible not compatible with new. Some implementations of new[ ] embed the number of allocated objects first into the beginning of the allocated memory chunk, and return pointer to the remaining part of the array.

Related Posts:
Array Declaration and Initialization
Object Initialization

Saturday, November 5, 2011

Delete Duplicate Nodes from Linked List

Problem:
Given an unsorted linked list, delete duplicate nodes from the linked list.

Thinking:
This problem is again similar to previous 2 posts (determine uniqueness and remove duplicates). There are several algorithms available.

Node definition:
Suppose the node in this linked list is defined as the class below.

Solution 1:
Without using extra buffer, delete every duplicates for each node encountered. Time complexity \(O(n^2)\).

Solution 2:
Without using extra buffer, each time encountering a node, detect if it's a duplicate. If yes, then delete it. Time complexity \(O(n^2)\).

Solution 3:
Using an extra hash table / hash set / hash map to keep track which nodes have already existed. Average time complexity \(O(n)\).

Notes:
  1. Would there be any differences if the linked list is doubly linked?
    ==> Probably not.
  2. How about sorting the list first?
    ==> As long as you are allowed to destroy the original order, this should work.

Friday, November 4, 2011

Remove Duplicate Characters

Problem:
Given a string, remove the duplicate characters in the string.

Assumption:
All the characters are from ASCII set and in the range 0~255. Also, the character null character does not appear in the string.

Requirement:
An extra copy of the array is not allowed.

Similar to last post about approaches to deciding unique characters or not, this can be done in place or with an array of fixed size.

Note:
Although C++ algorithm library has unique function, which may seem appropriate for this problem. However, that unique function just removes the consecutive duplicates.

We may also sort the string first if we can remove any duplicates, not necessarily keeping the first one in appearance order. In this way, we may get to time complexity \(O(n\log n)\). Also, this method would destroy the order of unique characters. This solution is not shown in this post.

Solution 1:
For each character in the string and unique, detect all duplicates in the string afterwards. Mark those duplicates and remove them in the end. Suppose the string is given as a char array, otherwise we may need convert a String to StringBuffer, StringBuilder, or CharArray.

Solution 2:
For each character in the string, first determine if it's duplicate. If yes, simply skip it; if not, keep it in the string.

Solution 3:
An extra array of fixed size can be used to record which characters have appeared. This can reduce the time complexity from \(O(n^2)\) to \(O(n)\). Also, we can have boolean array or bitset approaches. Each of them can be applied to the above 2 schemes and we can have 4 different algorithms. Here, I am showing boolean array with Scheme 1 in this solution, and bitset with Scheme 2 for next solution.

Solution 4:
Use bitset with the above Scheme 2 to remove the duplicates.

Thursday, November 3, 2011

String Contains All Unique Characters or Not

Problem:
Given a string, determine if this string contains all unique characters.

Assumption:
The characters in this string are from ASCII character set. In other words, the value range is 0~255.

Solution 1:
Directly compare each character in the string with every other character in the string. This approach will take \(O(n^2)\) time and no space.

Solution 2:
If allowed to destroy the string, we may first sort the string using quicksort, and then scan through the string to see if there are 2 adjacent characters are the same. Quicksort will take \(O(n\log n)\) time and \(O(1)\) space. If using mergesort, it will take \(O(n)\) space. In Java, to sort the string, an extra array or StringBuffer is needed.

Solution 3:
Since the characters are in a small range, and we can create an array to record if one certain character has appeared or not. This approach will take \(O(n)\) time and \(O(m)\) space, where \(m\) is the size of character set.

Solution 4:
Similar to Solution 3, but instead of using an array of boolean values, use a set of bits to record the binary values. In Java, \(256/32=8\) int size bits are needed, which can be implemented as an array of 8 integers, and each integer represents 32 characters. In C++, there is a very convenient structure called bitset, which can be beautifully used here.

Notes:
Since the characters are all from the assumed set, and the set size is known. In this case, the maximum length for a string of all unique characters will be the size of the character set. Therefore, for each of the above programs, we may add the following code.
if( s.length() > 256 ) return false;

Wednesday, November 2, 2011

TreeSet vs HashSet vs LinkedHashSet

The post here nicely summarized the collections TreeSet, HashSet, and LinkedHashSet in Java. I mostly repeat that.

TreeSetHashSetLinkedHashSet
public class TreeSet
extends AbstractSet
implements SortedSet, Cloneable, Serializable
public class HashSet
extends AbstractSet
implements Set, Cloneable, Serializable
public class LinkedHashSet
extends HashSet
implements Set, Cloneable, Serializable
unique valuesunique valuesunique values
red-black treehash tablehash table with double links
ascending orderundefined orderinsertion order
\(O(\log n)\) for add, remove and contains\(O(1)\)\(O(1)\), a littler slower than HashSet
except for the operation of iteration

For more details, please see the reference blog.

References:
Vidya's Blog

Tuesday, November 1, 2011

Size of Class Objects

Some notes about how to determine the size of class objects in C++.

  1. The keyword sizeof can work on both class type names and class objects.
  2. Static member data don't contribute to the size of class (they are classwide available)
  3. Member functions don't really contribute to the size of class (except the vtable for virtual functions).
  4. The this pointer doesn't contribute to the size of class objects.
  5. Friends and pointers to members are not class members at all, and hence they won't contribute to the class size.
  6. It is unspecified whether or not a reference requires storage. It may be compiler specific.
  7. The actual size is most likely greater than the sum of sizes for each non-static member data and virtual pointers. This is because of byte alignment or padding, and depends on the compiler.
  8. Virtual pointer will take some space if there are virtual functions or virtual inheritance in the class.
  9. Empty classes won't be size 0 in order to distinguish the objects of that class type. By most compilers, their size will be 1.
  10. Empty sub-objects will take no space in memory when an empty class is inherited by a non-empty class type.
  11. Regular inheritance will make the size of derived class the sum of all non-static member sizes from its base classes.
  12. The _vptr field is always an alias of the first available _vptr of its base classes, if there are some. Otherwise, the class itself will create a new _vptr if necessary.
  13. The virtual inheritance is kind of complex. It's designed to solve the diamond problem in inheritance, and there will be only 1 copy for the common ancestor members. Simply speaking, the class will first create the non-virtual base class members, and its own members in class body, and finally a unique set of members from virtual base classes (including immediate virtual base and inherited virtual base).

References:
C Programming
C++ FAQ
Wikipedia: Reference
Wikipedia: Virtual Inheritance

Monday, October 31, 2011

The Keyword typedef

Some usages for the keyword typedef.

  1. Introduce a synonym for fundamental types.
    typedef int Integer;
  2. Introduce a synonym for pointer types.
    typedef char *pChar;
  3. Introduce a synonym for reference types.
    typedef double &refDouble;
  4. Introduce a synonym for struct, union, enum in C.
    typedef struct {int key; NODE *next;} NODE;
  5. Introduce a synonym for struct or class in C++.
    typedef class Old New;
    typedef Old New;
  6. Introduce a synonym for array types.
    typedef double dArray[10];
  7. Introduce a synonym for function types.
    typedef void Func(int, double);
  8. Introduce a synonym for function pointer types.
    typedef (void *)FuncPtr(int, double);

Pointers to Members in C++

Rarely used. Just keep a memo.

Example:
class A {
public:
    int x;
    char *s;
    void f( float );
};

int main()
{
    int A::*xp = &A::x;
    char *A::*sp = &A::s;
    void (A::*fp)(float) = &A::f;

    A a;
    A *ap = &a;

    a.*xp ...
    a.*sp ...
    (a.*)fp( 0.0 ) ...

    ap->*xp ...
    ap->*sp ...
    (ap->*)fp( 0.0 ) ...

    ...
    return 0;
}

Notes:
  1. Pointers to members cannot be pointed to static members.
  2. The address stored in pointers to members are offset to the class object address.
  3. Pointers to members must be used with class objects.
  4. Pointers to member functions cannot be used to show function address. Used to call function only.

References:
IBM publib

Friend Functions and Classes in C++

Some notes for friendship in C++.

  1. Friends of a class can access any member of the class, either public, protected, or private.

  2. Friends of a class can be declared as different scopes.
    1. Global functions: any object or function that is able to call the global function may get access to the class contents.
    2. Global classes: all member functions of such global class may have access to the protected and private fields of the class declaring friend.
    3. Members of other class: this will specify which member function or class of the "friend" class can have access to the class contents.

  3. Friendship is not inherited, reciprocal, or transitive.
    1. If class A is friend of B and C is derived from B, A may not be friend of C unless explicitly declared.
    2. If class A is friend of B and C is derived from A, C may not be friend of B unless explicitly declared.
    3. If class A is friend of B, B may not be friend of A unless explicitly declared.
    4. If class A is friend of B and B is friend of C, A may not be friend of C unless explicitly declared.

  4. Friends are not member of the class.
    1. The declaration of friends can be put anywhere, either public, protected, or private.
    2. The this pointer is not available in friends.

  5. A friend function or class can be the friend of multiple classes, hence they can be used for message passing between classes.

  6. The definition of friends can be placed in the body of class who declares the friendship. Even in this case, the friend functions or classes are in the global scope, not members of the class.

  7. Do friends violate encapsulation?
    No! If they're used properly, they enhance encapsulation. Many people think of a friend function as something outside the class. Instead, try thinking of a friend function as part of the class's public interface. A friend function in the class declaration doesn't violate encapsulation any more than a public member function violates encapsulation: both have exactly the same authority with respect to accessing the class's non-public parts.

References:
Coding Unit Tutorials
C++ FAQ

Operator Overloading in C++

Just wrap-up the notes for operator overloading in C++.

  1. Almost all operators can be overloaded in user-defined classes, except the following four.
    .   .*   ::   ?:
  2. Some operator overloading can be defined as class member function only.
    ()   []   ->   "any assignment operators"   "type conversion"
    Question: how about ->*?

  3. In some cases, global functions are needed to implement operator overloads.
    1. When using stream operator on a user-defined class, it may be impossible to overload the cout's member operator functions for a general programmer. In this case, global functions is needed.
    2. When defining some operations requiring commutability, the global function approach is necessary, since class member functions always take the object itself as the first implicit argument.

  4. Prefix increment and decrement overloads are exactly the same as any other unary operators. For postfix increment and decrement, the compiler will generate a function call operator++ with one int argument 0 (or besides the class object parameter in global functions).

  5. Type casting operations are possible using operator overloading. For example,
    A::operator int() const;
    A::operator B() const;   // class B should be defined/declared before class A
    Note that the return type may not be specified on a type casting function.

Sunday, October 30, 2011

Type Casting in C++

There are several ways to cast one type to another in C++. I'd like to categorize them as implicit conversion, explicit conversion, and keyword conversion (which is also explicit, and I separate it for discussion).

Implicit Conversion

As in C, some casting can be automatically done for some compatible types.
int a = 100;
double b;
b = a;
Also, such implicit conversion occurs when initialize or assign to class objects, calling the overloaded "=" operator or constructor.
class A {};
class B { public: B(A a){} };
A a;
B b = a;

Explicit Conversion
Like in C, we can use parentheses to cast types. The ways for explicit casting include:
(new_type) expression
new_type (expression)
(new_type) (expression)
Some notes for this:
  1. At least one pair of parentheses is required.
  2. If the new type is pointer, reference, or contains multiple words (like const char), the parentheses for new_type is required.
  3. If the expression is not a single variable or number (like a+b, the parenthesis for expression is required.
  4. This explicit casting applies to almost any types (including class types with explicit constructor or operator function), but there may be errors in run time.

Keyword Conversion
Four keywords are available for casting: static_cast, dynamic_cast, reinterpret_cast, const_cast. The usage for keyword conversion is
static_cast<new_type>(expression)
dynamic_cast<new_type>(expression)
reinterpret_cast<new_type>(expression)
const_cast<new_type>(expression)
Note: the parentheses for expression are always required, even when the expression is a single variable or number.

static_cast
  1. It can be used to cast between fundamental types, as well as class types for those with explicit constructors or operator functions.
  2. It can perform casting between pointers or references to related classes. The compiler won't complain if casting base class pointer to derived class pointer. The overhead of type-safe check by dynamic_cast can be avoided. It's the programmers' responsibility to ensure safe conversions.
dynamic_cast
  1. It can be used only with pointers or references to objects. Its purpose is to ensure that the result of the type conversion is a valid complete object of the requested class.
  2. Dynamic casting from derived class to base class is always successful, while from base class to derived class, successful only when the base class is polymorphic.
  3. It requires the Run-Time Type Information (RTTI) to keep track of dynamic types. Some compilers support this feature as an option which is disabled by default. This must be enabled for runtime type checking using dynamic_cast to work properly.
reinterpret_cast
  1. It converts any pointer type to any other pointer type, even of unrelated classes.
  2. All pointer conversions are allowed: neither the content pointed nor the pointer type itself is checked, and thus dereferencing it is unsafe..
  3. It can also cast pointers to or from integer types.
const_cast
  1. It manipulates the constness or volatileness of an object, either to be set or to be removed.

References
C++ Tutorial

Related Posts:
Type Casting on Classes
static_cast, dynamic_cast, and reinterpret_cast
Assignment between Class Objects

Saturday, October 29, 2011

Concatenate and Split Characters in Excel

One more post for Excel: combine or split characters.

Combine:
Suppose you want to combine the texts in cells A1, B1, C1 into cell D1.
===
In the target cell D1, type =A1&B1&C1. Enter and done. Pretty much like the overloaded "+" operation in many OOP programming languages.

Split:
Suppose in cell C3, we have text "12345", and we want to separate each character into one cell. For example, we can put "1" in cell D3, "2" in cell E3, and so on.
===
In cell D3, type =MID($\$$C3, (COLUMN(D3)-COLUMN($\$$C3)), 1), and press Enter.
Then use autofill to complete the rest of this row.

Select Every \(n^{th}\) Row in Excel

Today, I processed some Excel file and tried to select other row in a spreadsheet. Again, I am discussing the way with Excel, not programmatic.

Actually, my problem can be generalized to select every \(n^{th}\) row in the spreadsheet. The approach is based on Excel filter and described as below.

Excel version: 2010
Suppose the data area is C10:E20, and I am going to select every 3rd row.
  1. In a separate area from the data matrix to be selected, locate a cell at the same row as the data area. Pick A10 in my example.
  2. Type in the cell A10 =MOD((ROW(A10)-ROW(A$9)), 3), and press Enter.
  3. Using autofill feature to complete all rows 10 to 20.
  4. Locate Cell A10, click Filter under Data menu.
  5. Select 0 only from the drop-down list.
  6. Copy the filtered data into another separate area or a new sheet.
  7. If copied in the same sheet, clear Filter to see complete data.

Friday, October 28, 2011

Function Pointers in C++

Just discovered some interesting facts about function pointers in C++. Here I am giving a short summary.

Different compilers may treat function pointer operations in different ways. For the output operation cout <<, MSVC10 can print the actual address for general function pointers, but g++ will print a bool value 1. However, if casted to void * type, both MSVC and g++ can display the actual address.

For class member functions, MSVC works the same way as g++, and both would output a bool value without casting. One more point is that C++ standard does not allow class member pointers be casted to void * type. Therefore, we have no way to get the real address unless we use the printf function.

Here are a sample program and its outputs by MSVC and g++.

Output by MSVC:
Output by g++:

======================================
Some more thinking about this 2 days after the original posting. I just found another special case in which both MSVC and g++ work exactly the same way for function pointers. Also, this may explain why g++ prefers to output a bool value for general function pointers in the statement cout << fp.

The special case is related to stream manipulator! Actually, stream manipulators are function pointers with the prototype ostream & manipulator( ostream & ). Seems that C++ library has already overloaded the case cout.operator<<( ostream & func(ostream &) ) and it will actually execute func( ostream & ) and returns ostream &. For all other function names or function pointers, it will return a bool value.

Class Templates in C++

In one of my previous posts, I made some short notes about class and function nesting. Now, I'd like to discuss more about class templates in C++.

First, we need be able to tell the difference between class template and template class.

Class template: is a template used to generate template classes. You cannot declare an object of a class template.
Template class: is an instance of a class template.

The class template definition is preceded by
    template< template-parameter-list >
where template-parameter-list is a comma-separated list of one or more of the following kinds of template parameters:
    type, non-type, template
Here is an example for template type for template parameter:
    template<template <class T> class X> class A { };
    template<class T> class B { };
    A<b> a;
Multiple template declarations may exist and all template declarations for a class template must have the same types and number of template arguments. Only one template declaration containing the class definition is allowed.

When nested template argument lists are needed, a separating space between the > at the end of the inner list and the > at the end of the outer list is necessary. Otherwise, there is an ambiguity between the extraction operator >> and two template list delimiters >. This is an example:
     map< int, vector<int> >
     map<int, vector<int>>      // error
I double checked this and found that it is an error with g++ but okay with MSVC 2010.
     map<int, vector<int>>      // okay with MSVC 2010
Templates can be defined within ordinary classes or class templates, in which case they are referred to as member templates. Member templates that are classes are referred to as nested class templates.

Nested class templates are declared as class templates inside the scope of the outer class. They can be defined inside or outside of the enclosing class.

When nested class templates are defined outside of their enclosing class, they must be prefaced by the template parameters for both the class template (if they are members of a class template) and template parameters for the member template.

One template parameter list is required for each level, and cannot combine multiple lists.
template<outer-template-parameter-list>
template<inner-template-parameter-list>
return_type Outer<..>::Inner<..>::func(..) {}

// below is illegal
template<outer-template-parameter-list, inner-template-parameter-list>
return_type Outer<..>::Inner<..>::func(..) {}

// if outer class is ordinary class, then only one template list is possible
template<inner-template-parameter-list>
return_type Outer::Inner<..>::func(..) {} 

It is not allowed to specialize template members of a template class without specializing the outside class.

Update:
Local classes are not allowed to have member templates.

References:
IBM publib
Microsoft MSDN

Related Posts:
Nested Classes and Functions
Function Template Specialization

Thursday, October 27, 2011

The Number of Inversions in an Array

Problem:
Let \(A[1...n]\) be an array of \(n\) distinct numbers. If \(i \lt j\) and \(A[i] \gt A[j]\), then the pair \( (i,j) \) is called an inversion of A. Determine the number of inversions given an array of distinct numbers.

Solution 1:
Naive approach. There are \( \frac{n(n-1)}{2} \) number pairs in an array of size \(n\), and check one by one. The time complexity is \(O(n^2)\).

We can have a better solution for this problem, which is proposed as below.

Solution 2:
Divide-and-conquer. This strategy works in similar way to merge-sort, and bears time complexity \(n\log n\).

Various Implementations of pow(x, y)

In this post, I discuss the implementation of function
double pow( double, int ).

Let's first consider the most simple case: y is a positive integer.
The naive and straightforward approach is working with a loop like following.

Next, we need consider all possible integers for y.

We can write a recursive function to calculate x raised to the power y.

The time complexities for all above 3 implementations are \(O(n)\). Use of a temporary variable would help.

Now, the time complexity reduces to \(O(\log n)\). How about non-recursive version? Yes, we can do it.

Actually, we can have one more implementation with bit operations!! Sounds amazing!

Syntax of Function pow(x, y)

This post just summarizes the syntax of power function pow(x,y) in C and C++.

ISO/ANSI C syntax:
double pow( double, double );

C++98 Functions:
float std::pow( float, float );
float std::pow( float, int );
double std::pow( double, double );
double std::pow( double, int );
long double std::pow( long double, long double );
long double std::pow( long double, int );

Under C++98 standard, the call std::pow(int, int) will be a compile error.
However, C++0X standard should be able to handle this. C++0X adds the additional overloads sufficient to ensure that:
  1. If any argument corresponding to a double parameter has type long double, then all arguments corresponding to double parameters are effectively casted to long double.
  2. Otherwise, if any argument corresponding to a double parameter has type double or an integer type, then all arguments corresponding to double parameters are effectively casted to double.
  3. Otherwise, all arguments corresponding to double parameters are effectively casted to float.


References:
http://www.daniweb.com/software-development/cpp/threads/108300

Static Member Functions in C++

In this post I make some notes about static member functions in C++.

  1. Static member functions belong to and can be accessed via the class. Also, it can be accessed via objects of the class type, but it's not recommended.
  2. Static member functions have no implicit this pointer, since the this pointer always points to the object that the member function is working on.
  3. Static member functions can only access the names of static members, enumerators, and nested types of the class in which it is declared.
  4. Even when accessing static members in static member functions, this pointer is not allowed.
  5. Static member functions can be accessed via this pointer in other nonstatic member functions.
  6. Static member functions cannot be declared with the keywords virtual, const, or volatile.
  7. Static member functions cannot have the same name as a nonstatic member function that has the same argument types. In other words, the keyword static is not part of function signature.
  8. Pure static class (has static members and static membe functions only) is dangerous like global variables.
  9. Pure static class cannot have multiple copies without cloning and renaming.
  10. Operator overloading functions of a class cannot be static member functions
  11. Pointers to member cannot be used to point to any static member, including static member functions.
  12. The left side of a member-selection operator (. or ->) that selects a static member function is not evaluated. For example, the expression SideEffects().CountOf() does not call the function SideEffect.

I checked the last item and found some conflicting results. Here is an example (actually counter-example) for that item. In my example, the left side function call was actually executed, either passing arguments or not.

The output is:
0
in increaA
1
in increaA
2
in decreaA
1
in decreaA
0


References:
Microsoft MSDN library
IBM publib
LearnCpp.com

Swap 2 Variables

This problem can be seen everywhere, and almost every people knows the solution. In this post, I'm going to summarize 3 functions for swap.

Function 1: naiveSwap

Function 2: xorSwap

Function 3: addSwap

Summary:
naiveSwapxorSwapaddSwap
extra variableneednot neednot need
check \(x \ne y\) (address)not needneedneed
readabilitygoodnot goodnot good
overflowno problemno problempotential problem
aliasingfinecomplicatedcomplicated
execution modelinstruction pipelinestrictly sequentialstrictly sequential


Situations for XOR use in practice:
  • On a processor where the instruction set encoding permits the XOR swap to be encoded in a smaller number of bytes;
  • In a region with high register pressure, it may allow the register allocator to avoid spilling a register;
  • In Microcontrollers where available RAM is very limited.
Because these situations are rare, most optimizing compilers don't generate XOR swap code.

Because of the problem of overflow, the add-swap approach is used less often.


References:
Wikipedia item: XOR swap algorithm

Wednesday, October 26, 2011

Next Number in Order

Problem:
Given a number as string, output the next number in order. The digits you may use are only those already given, plus any number of 0's.

Example:
Given 1234, output 1243;
Given 4321, output 10234.

Solution:
In previous post, I discussed the ways using C++ library function next_permutation. What if we are not allowed to use this function? Can we write de novo code?

First, I tried Java. Since string in Java cannot be updated in place, and hence it needs many auxiliary variables, including StringBuffer objects that can be set at a specific index.

Here is the Java code.

Secondly, I tried to implement this function in C. Suppose the string given was stored in a char array, which is the C style.

Here is the C code.

Related Posts:
Next Number in Order with next_permutation

Highlight Section in Blog

Method:
Add corresponding CSS style into head section. Name classes meaningful.

Here is a simple example ...
======================================================

Hello, World!
I just want to highlight this section. I just want to highlight this section. I just want to highlight this section. I just want to highlight this section.

Also, I want to highlight this word only. Also, I want to highlight this word only. Also, I want to highlight this word only.

This is a table aligned to left:
112345
123451

This is a table aligned to center:
112345
123451

Goodbye ...

Monday, October 24, 2011

Next Number in Order

Problem:
Given a number as string, output the next number in order. The digits you may use are only those already given, plus any number of 0's.

Example:
Given 1234, output 1243;
Given 4321, output 10234.

Solution:
The best approach is to make use of the C++ next_permutation template function from <algorithm> library. This template can work on a string and update it in place to the next string in lexicographical order. If the given string is already the largest one, then the function returns false and changes the string to the smallest one; otherwise, returns true and modifies the string to next one in order.

Code 1: first call next_permutation, then insert a zero if false returned.

Code 2: first insert a 0 to the beginning of the string, after permutation, remove it if it's still at the beginning.

It's better to look into the magic next_permutation template. When not allowed to use such library function, we may need write code similar to that.

Related Posts:
Next Number in Order without next_permutation

Nonrecursive Algorithm to Inorder Traverse Binary Tree

In previous post, I discussed the nonrecursive algorithm with stack to inorder traverse a binary tree. That's also pretty straightforward. In fact, we can implement the inorder traversal of binary trees without stack. This is more complicated but elegant.

For this approach, I use one more pointer to record the subtree already visited. Also, we need keep in mind that we are doing INORDER traversal. Therefore, if the right subtree of a node has been visited, the node itself must have been visited.


Related Posts:
Nonrecursive Algorithm with Stack to Inorder Traverse Binary Tree