Monday, October 31, 2011

The Keyword typedef

Some usages for the keyword typedef.

  1. Introduce a synonym for fundamental types.
    typedef int Integer;
  2. Introduce a synonym for pointer types.
    typedef char *pChar;
  3. Introduce a synonym for reference types.
    typedef double &refDouble;
  4. Introduce a synonym for struct, union, enum in C.
    typedef struct {int key; NODE *next;} NODE;
  5. Introduce a synonym for struct or class in C++.
    typedef class Old New;
    typedef Old New;
  6. Introduce a synonym for array types.
    typedef double dArray[10];
  7. Introduce a synonym for function types.
    typedef void Func(int, double);
  8. Introduce a synonym for function pointer types.
    typedef (void *)FuncPtr(int, double);

Pointers to Members in C++

Rarely used. Just keep a memo.

Example:
class A {
public:
    int x;
    char *s;
    void f( float );
};

int main()
{
    int A::*xp = &A::x;
    char *A::*sp = &A::s;
    void (A::*fp)(float) = &A::f;

    A a;
    A *ap = &a;

    a.*xp ...
    a.*sp ...
    (a.*)fp( 0.0 ) ...

    ap->*xp ...
    ap->*sp ...
    (ap->*)fp( 0.0 ) ...

    ...
    return 0;
}

Notes:
  1. Pointers to members cannot be pointed to static members.
  2. The address stored in pointers to members are offset to the class object address.
  3. Pointers to members must be used with class objects.
  4. Pointers to member functions cannot be used to show function address. Used to call function only.

References:
IBM publib

Friend Functions and Classes in C++

Some notes for friendship in C++.

  1. Friends of a class can access any member of the class, either public, protected, or private.

  2. Friends of a class can be declared as different scopes.
    1. Global functions: any object or function that is able to call the global function may get access to the class contents.
    2. Global classes: all member functions of such global class may have access to the protected and private fields of the class declaring friend.
    3. Members of other class: this will specify which member function or class of the "friend" class can have access to the class contents.

  3. Friendship is not inherited, reciprocal, or transitive.
    1. If class A is friend of B and C is derived from B, A may not be friend of C unless explicitly declared.
    2. If class A is friend of B and C is derived from A, C may not be friend of B unless explicitly declared.
    3. If class A is friend of B, B may not be friend of A unless explicitly declared.
    4. If class A is friend of B and B is friend of C, A may not be friend of C unless explicitly declared.

  4. Friends are not member of the class.
    1. The declaration of friends can be put anywhere, either public, protected, or private.
    2. The this pointer is not available in friends.

  5. A friend function or class can be the friend of multiple classes, hence they can be used for message passing between classes.

  6. The definition of friends can be placed in the body of class who declares the friendship. Even in this case, the friend functions or classes are in the global scope, not members of the class.

  7. Do friends violate encapsulation?
    No! If they're used properly, they enhance encapsulation. Many people think of a friend function as something outside the class. Instead, try thinking of a friend function as part of the class's public interface. A friend function in the class declaration doesn't violate encapsulation any more than a public member function violates encapsulation: both have exactly the same authority with respect to accessing the class's non-public parts.

References:
Coding Unit Tutorials
C++ FAQ

Operator Overloading in C++

Just wrap-up the notes for operator overloading in C++.

  1. Almost all operators can be overloaded in user-defined classes, except the following four.
    .   .*   ::   ?:
  2. Some operator overloading can be defined as class member function only.
    ()   []   ->   "any assignment operators"   "type conversion"
    Question: how about ->*?

  3. In some cases, global functions are needed to implement operator overloads.
    1. When using stream operator on a user-defined class, it may be impossible to overload the cout's member operator functions for a general programmer. In this case, global functions is needed.
    2. When defining some operations requiring commutability, the global function approach is necessary, since class member functions always take the object itself as the first implicit argument.

  4. Prefix increment and decrement overloads are exactly the same as any other unary operators. For postfix increment and decrement, the compiler will generate a function call operator++ with one int argument 0 (or besides the class object parameter in global functions).

  5. Type casting operations are possible using operator overloading. For example,
    A::operator int() const;
    A::operator B() const;   // class B should be defined/declared before class A
    Note that the return type may not be specified on a type casting function.

Sunday, October 30, 2011

Type Casting in C++

There are several ways to cast one type to another in C++. I'd like to categorize them as implicit conversion, explicit conversion, and keyword conversion (which is also explicit, and I separate it for discussion).

Implicit Conversion

As in C, some casting can be automatically done for some compatible types.
int a = 100;
double b;
b = a;
Also, such implicit conversion occurs when initialize or assign to class objects, calling the overloaded "=" operator or constructor.
class A {};
class B { public: B(A a){} };
A a;
B b = a;

Explicit Conversion
Like in C, we can use parentheses to cast types. The ways for explicit casting include:
(new_type) expression
new_type (expression)
(new_type) (expression)
Some notes for this:
  1. At least one pair of parentheses is required.
  2. If the new type is pointer, reference, or contains multiple words (like const char), the parentheses for new_type is required.
  3. If the expression is not a single variable or number (like a+b, the parenthesis for expression is required.
  4. This explicit casting applies to almost any types (including class types with explicit constructor or operator function), but there may be errors in run time.

Keyword Conversion
Four keywords are available for casting: static_cast, dynamic_cast, reinterpret_cast, const_cast. The usage for keyword conversion is
static_cast<new_type>(expression)
dynamic_cast<new_type>(expression)
reinterpret_cast<new_type>(expression)
const_cast<new_type>(expression)
Note: the parentheses for expression are always required, even when the expression is a single variable or number.

static_cast
  1. It can be used to cast between fundamental types, as well as class types for those with explicit constructors or operator functions.
  2. It can perform casting between pointers or references to related classes. The compiler won't complain if casting base class pointer to derived class pointer. The overhead of type-safe check by dynamic_cast can be avoided. It's the programmers' responsibility to ensure safe conversions.
dynamic_cast
  1. It can be used only with pointers or references to objects. Its purpose is to ensure that the result of the type conversion is a valid complete object of the requested class.
  2. Dynamic casting from derived class to base class is always successful, while from base class to derived class, successful only when the base class is polymorphic.
  3. It requires the Run-Time Type Information (RTTI) to keep track of dynamic types. Some compilers support this feature as an option which is disabled by default. This must be enabled for runtime type checking using dynamic_cast to work properly.
reinterpret_cast
  1. It converts any pointer type to any other pointer type, even of unrelated classes.
  2. All pointer conversions are allowed: neither the content pointed nor the pointer type itself is checked, and thus dereferencing it is unsafe..
  3. It can also cast pointers to or from integer types.
const_cast
  1. It manipulates the constness or volatileness of an object, either to be set or to be removed.

References
C++ Tutorial

Related Posts:
Type Casting on Classes
static_cast, dynamic_cast, and reinterpret_cast
Assignment between Class Objects

Saturday, October 29, 2011

Concatenate and Split Characters in Excel

One more post for Excel: combine or split characters.

Combine:
Suppose you want to combine the texts in cells A1, B1, C1 into cell D1.
===
In the target cell D1, type =A1&B1&C1. Enter and done. Pretty much like the overloaded "+" operation in many OOP programming languages.

Split:
Suppose in cell C3, we have text "12345", and we want to separate each character into one cell. For example, we can put "1" in cell D3, "2" in cell E3, and so on.
===
In cell D3, type =MID($\$$C3, (COLUMN(D3)-COLUMN($\$$C3)), 1), and press Enter.
Then use autofill to complete the rest of this row.

Select Every \(n^{th}\) Row in Excel

Today, I processed some Excel file and tried to select other row in a spreadsheet. Again, I am discussing the way with Excel, not programmatic.

Actually, my problem can be generalized to select every \(n^{th}\) row in the spreadsheet. The approach is based on Excel filter and described as below.

Excel version: 2010
Suppose the data area is C10:E20, and I am going to select every 3rd row.
  1. In a separate area from the data matrix to be selected, locate a cell at the same row as the data area. Pick A10 in my example.
  2. Type in the cell A10 =MOD((ROW(A10)-ROW(A$9)), 3), and press Enter.
  3. Using autofill feature to complete all rows 10 to 20.
  4. Locate Cell A10, click Filter under Data menu.
  5. Select 0 only from the drop-down list.
  6. Copy the filtered data into another separate area or a new sheet.
  7. If copied in the same sheet, clear Filter to see complete data.

Friday, October 28, 2011

Function Pointers in C++

Just discovered some interesting facts about function pointers in C++. Here I am giving a short summary.

Different compilers may treat function pointer operations in different ways. For the output operation cout <<, MSVC10 can print the actual address for general function pointers, but g++ will print a bool value 1. However, if casted to void * type, both MSVC and g++ can display the actual address.

For class member functions, MSVC works the same way as g++, and both would output a bool value without casting. One more point is that C++ standard does not allow class member pointers be casted to void * type. Therefore, we have no way to get the real address unless we use the printf function.

Here are a sample program and its outputs by MSVC and g++.

Output by MSVC:
Output by g++:

======================================
Some more thinking about this 2 days after the original posting. I just found another special case in which both MSVC and g++ work exactly the same way for function pointers. Also, this may explain why g++ prefers to output a bool value for general function pointers in the statement cout << fp.

The special case is related to stream manipulator! Actually, stream manipulators are function pointers with the prototype ostream & manipulator( ostream & ). Seems that C++ library has already overloaded the case cout.operator<<( ostream & func(ostream &) ) and it will actually execute func( ostream & ) and returns ostream &. For all other function names or function pointers, it will return a bool value.

Class Templates in C++

In one of my previous posts, I made some short notes about class and function nesting. Now, I'd like to discuss more about class templates in C++.

First, we need be able to tell the difference between class template and template class.

Class template: is a template used to generate template classes. You cannot declare an object of a class template.
Template class: is an instance of a class template.

The class template definition is preceded by
    template< template-parameter-list >
where template-parameter-list is a comma-separated list of one or more of the following kinds of template parameters:
    type, non-type, template
Here is an example for template type for template parameter:
    template<template <class T> class X> class A { };
    template<class T> class B { };
    A<b> a;
Multiple template declarations may exist and all template declarations for a class template must have the same types and number of template arguments. Only one template declaration containing the class definition is allowed.

When nested template argument lists are needed, a separating space between the > at the end of the inner list and the > at the end of the outer list is necessary. Otherwise, there is an ambiguity between the extraction operator >> and two template list delimiters >. This is an example:
     map< int, vector<int> >
     map<int, vector<int>>      // error
I double checked this and found that it is an error with g++ but okay with MSVC 2010.
     map<int, vector<int>>      // okay with MSVC 2010
Templates can be defined within ordinary classes or class templates, in which case they are referred to as member templates. Member templates that are classes are referred to as nested class templates.

Nested class templates are declared as class templates inside the scope of the outer class. They can be defined inside or outside of the enclosing class.

When nested class templates are defined outside of their enclosing class, they must be prefaced by the template parameters for both the class template (if they are members of a class template) and template parameters for the member template.

One template parameter list is required for each level, and cannot combine multiple lists.
template<outer-template-parameter-list>
template<inner-template-parameter-list>
return_type Outer<..>::Inner<..>::func(..) {}

// below is illegal
template<outer-template-parameter-list, inner-template-parameter-list>
return_type Outer<..>::Inner<..>::func(..) {}

// if outer class is ordinary class, then only one template list is possible
template<inner-template-parameter-list>
return_type Outer::Inner<..>::func(..) {} 

It is not allowed to specialize template members of a template class without specializing the outside class.

Update:
Local classes are not allowed to have member templates.

References:
IBM publib
Microsoft MSDN

Related Posts:
Nested Classes and Functions
Function Template Specialization

Thursday, October 27, 2011

The Number of Inversions in an Array

Problem:
Let \(A[1...n]\) be an array of \(n\) distinct numbers. If \(i \lt j\) and \(A[i] \gt A[j]\), then the pair \( (i,j) \) is called an inversion of A. Determine the number of inversions given an array of distinct numbers.

Solution 1:
Naive approach. There are \( \frac{n(n-1)}{2} \) number pairs in an array of size \(n\), and check one by one. The time complexity is \(O(n^2)\).

We can have a better solution for this problem, which is proposed as below.

Solution 2:
Divide-and-conquer. This strategy works in similar way to merge-sort, and bears time complexity \(n\log n\).

Various Implementations of pow(x, y)

In this post, I discuss the implementation of function
double pow( double, int ).

Let's first consider the most simple case: y is a positive integer.
The naive and straightforward approach is working with a loop like following.

Next, we need consider all possible integers for y.

We can write a recursive function to calculate x raised to the power y.

The time complexities for all above 3 implementations are \(O(n)\). Use of a temporary variable would help.

Now, the time complexity reduces to \(O(\log n)\). How about non-recursive version? Yes, we can do it.

Actually, we can have one more implementation with bit operations!! Sounds amazing!

Syntax of Function pow(x, y)

This post just summarizes the syntax of power function pow(x,y) in C and C++.

ISO/ANSI C syntax:
double pow( double, double );

C++98 Functions:
float std::pow( float, float );
float std::pow( float, int );
double std::pow( double, double );
double std::pow( double, int );
long double std::pow( long double, long double );
long double std::pow( long double, int );

Under C++98 standard, the call std::pow(int, int) will be a compile error.
However, C++0X standard should be able to handle this. C++0X adds the additional overloads sufficient to ensure that:
  1. If any argument corresponding to a double parameter has type long double, then all arguments corresponding to double parameters are effectively casted to long double.
  2. Otherwise, if any argument corresponding to a double parameter has type double or an integer type, then all arguments corresponding to double parameters are effectively casted to double.
  3. Otherwise, all arguments corresponding to double parameters are effectively casted to float.


References:
http://www.daniweb.com/software-development/cpp/threads/108300

Static Member Functions in C++

In this post I make some notes about static member functions in C++.

  1. Static member functions belong to and can be accessed via the class. Also, it can be accessed via objects of the class type, but it's not recommended.
  2. Static member functions have no implicit this pointer, since the this pointer always points to the object that the member function is working on.
  3. Static member functions can only access the names of static members, enumerators, and nested types of the class in which it is declared.
  4. Even when accessing static members in static member functions, this pointer is not allowed.
  5. Static member functions can be accessed via this pointer in other nonstatic member functions.
  6. Static member functions cannot be declared with the keywords virtual, const, or volatile.
  7. Static member functions cannot have the same name as a nonstatic member function that has the same argument types. In other words, the keyword static is not part of function signature.
  8. Pure static class (has static members and static membe functions only) is dangerous like global variables.
  9. Pure static class cannot have multiple copies without cloning and renaming.
  10. Operator overloading functions of a class cannot be static member functions
  11. Pointers to member cannot be used to point to any static member, including static member functions.
  12. The left side of a member-selection operator (. or ->) that selects a static member function is not evaluated. For example, the expression SideEffects().CountOf() does not call the function SideEffect.

I checked the last item and found some conflicting results. Here is an example (actually counter-example) for that item. In my example, the left side function call was actually executed, either passing arguments or not.

The output is:
0
in increaA
1
in increaA
2
in decreaA
1
in decreaA
0


References:
Microsoft MSDN library
IBM publib
LearnCpp.com

Swap 2 Variables

This problem can be seen everywhere, and almost every people knows the solution. In this post, I'm going to summarize 3 functions for swap.

Function 1: naiveSwap

Function 2: xorSwap

Function 3: addSwap

Summary:
naiveSwapxorSwapaddSwap
extra variableneednot neednot need
check \(x \ne y\) (address)not needneedneed
readabilitygoodnot goodnot good
overflowno problemno problempotential problem
aliasingfinecomplicatedcomplicated
execution modelinstruction pipelinestrictly sequentialstrictly sequential


Situations for XOR use in practice:
  • On a processor where the instruction set encoding permits the XOR swap to be encoded in a smaller number of bytes;
  • In a region with high register pressure, it may allow the register allocator to avoid spilling a register;
  • In Microcontrollers where available RAM is very limited.
Because these situations are rare, most optimizing compilers don't generate XOR swap code.

Because of the problem of overflow, the add-swap approach is used less often.


References:
Wikipedia item: XOR swap algorithm

Wednesday, October 26, 2011

Next Number in Order

Problem:
Given a number as string, output the next number in order. The digits you may use are only those already given, plus any number of 0's.

Example:
Given 1234, output 1243;
Given 4321, output 10234.

Solution:
In previous post, I discussed the ways using C++ library function next_permutation. What if we are not allowed to use this function? Can we write de novo code?

First, I tried Java. Since string in Java cannot be updated in place, and hence it needs many auxiliary variables, including StringBuffer objects that can be set at a specific index.

Here is the Java code.

Secondly, I tried to implement this function in C. Suppose the string given was stored in a char array, which is the C style.

Here is the C code.

Related Posts:
Next Number in Order with next_permutation

Highlight Section in Blog

Method:
Add corresponding CSS style into head section. Name classes meaningful.

Here is a simple example ...
======================================================

Hello, World!
I just want to highlight this section. I just want to highlight this section. I just want to highlight this section. I just want to highlight this section.

Also, I want to highlight this word only. Also, I want to highlight this word only. Also, I want to highlight this word only.

This is a table aligned to left:
112345
123451

This is a table aligned to center:
112345
123451

Goodbye ...

Monday, October 24, 2011

Next Number in Order

Problem:
Given a number as string, output the next number in order. The digits you may use are only those already given, plus any number of 0's.

Example:
Given 1234, output 1243;
Given 4321, output 10234.

Solution:
The best approach is to make use of the C++ next_permutation template function from <algorithm> library. This template can work on a string and update it in place to the next string in lexicographical order. If the given string is already the largest one, then the function returns false and changes the string to the smallest one; otherwise, returns true and modifies the string to next one in order.

Code 1: first call next_permutation, then insert a zero if false returned.

Code 2: first insert a 0 to the beginning of the string, after permutation, remove it if it's still at the beginning.

It's better to look into the magic next_permutation template. When not allowed to use such library function, we may need write code similar to that.

Related Posts:
Next Number in Order without next_permutation

Nonrecursive Algorithm to Inorder Traverse Binary Tree

In previous post, I discussed the nonrecursive algorithm with stack to inorder traverse a binary tree. That's also pretty straightforward. In fact, we can implement the inorder traversal of binary trees without stack. This is more complicated but elegant.

For this approach, I use one more pointer to record the subtree already visited. Also, we need keep in mind that we are doing INORDER traversal. Therefore, if the right subtree of a node has been visited, the node itself must have been visited.


Related Posts:
Nonrecursive Algorithm with Stack to Inorder Traverse Binary Tree

Nonrecursive Algorithm to Inorder Traverse Binary Tree

The recursive algorithm is simple and straightforward. Now I am discussing the nonrecursive version. One solution is to use an auxiliary stack, with time and space both \(O(n)\) in worst case.



The above code can be modified a little bit as shown below. These 2 versions will work exactly the same.


Related Posts:
Nonrecursive Algorithm without Stack to Inorder Traverse Binary Tree

Sunday, October 23, 2011

Create VBA Modules in Excel

Very often that people want to make some operations not supported by Excel itself. Yes, some friends asked me about this, and I also jumped into such situation sometimes. From a perspective of programmer, all operations like this can be done with some programming work. For general Excel users, they may not like programming and like some interactive way instead. Moreover, it's more straightforward to read and write files if working with Excel.

Here, I summarize a (hopefully) simple procedure to add customized VBA modules in Excel. Just for those who are interested.

  1. Show "Developer" menu in Excel Ribbon
  2. File → Options → Customize Ribbon, check "Developer" and click OK
  3. Open a blank Excel workbook, and save as .xlam format in the default directory
  4. For Office 2010 on Windows 7, the default directory is
    C:\Users\[username]\AppData\Roaming\Microsoft\AddIns
  5. Make the add-in available in Excel
  6. Developer → Add-Ins, check or browse to locate the add-in you just created
  7. Open VBA editor
  8. Developer → Visual Basic, or ALT + F11, and then select proper project
  9. Insert and edit customized module
  10. In VBA editor, Insert → Module, then edit or copy VBA code
  11. Save and exit
If you want to add the add-in to Excel Ribbon, you need create some XML markup first. I prefer not to do so, because I don't want to spend time doing that.

To use the add-in, after opening the data file on which you want to make operations, open VBA editor as above, then select the proper module and click Run Macro (F5).

I attach a sample VBA module code to delete the lower triangular entries of a matrix.


Reference:
Microsoft MSDN Library

Collections in Java

First keep in mind that the elements of collections in Java should be reference type only, not primitive type.

There are four types of collections and each type includes several classes.
TypeClasses
ListArrayList, LinkedList, CopyOnWriteArrayList
SetHashSet, TreeSet, LinkedHashSet, CopyOnWriteArraySet, ConcurrentSkipListSet
MapHashMap, TreeMap, LinkedHashMap, ConcurrentHashMap, ConcurrentSkipListMap
QueueLinkedList, PriorityQueue, SynchronousQueue, DelayQueue, ArrayBlockingQueue, LinkedBlockingQueue, PriorityBlockingQueue, ConcurrentLinkedQueue

Basic approach to choosing a collection:
  1. Select the collection type;
  2. Select specific class of that type, the one with extra functionality as few as possible.

Step 1: Select collection type
TypeFunctionalityTypical uses
List
  • Essentially a variable-size array;
  • You can usually add/remove items at any arbitrary position;
  • The order of the items is well defined (i.e. you can say what position a given item goes in in the list).
Most cases where you just need to store or iterate through a "bunch of things" and later iterate through them.
Set
  • Things can be "there or not"— when you add items to a set, there's no notion of how many times the item was added, and usually no notion of ordering.
  • Remembering "which items you've already processed", e.g. when doing a web crawl;
  • Making other yes-no decisions about an item, e.g. "is the item a word of English", "is the item in the database?" , "is the item in this category?" etc.
Map
  • Stores an association or mapping between "keys" and "values"
Used in cases where you need to say "for a given X, what is the Y"? It is often useful for implementing in-memory caches or indexes. For example:
  • For a given user ID, what is their cached name/User object?
  • For a given IP address, what is the cached country code?
  • For a given string, how many instances have I seen?
Queue
  • Like a list, but where you only ever access the ends of the list (typically, you add to one end and remove from the other).
  • Often used in managing tasks performed by different threads in an application (e.g. one thread receives incomming connections and puts them on a queue; other "worker" threads take connections off the queue for processing);
  • For traversing hierarchical structures such as a filing system, or in general where you need to remember "what data to process next", whilst also adding to that list of data;
  • Related to the previous point, queues crop up in various algorithms, e.g. build the encoding tree for Huffman compression.


Step 2: Select specific class

List
ClassFeatures/implementationWhen to use
ArrayList
  • Allows elements to be efficiently read by index.
  • Adding/removing the last element is efficient.
  • Not synchronized in any way.
In most cases.
LinkedList
  • First and last elements can be accessed efficiently;
  • Other elements cannot be efficiently accessed by index;
  • Not synchronized in any way.
Effectively, functions as a non-synchronized queue. In practice, rarely used: when you need a queue, you often need it to be concurrent or to provide other functionality; other implementations are often more useful.
CopyOnWriteArrayList
  • Allows safe concurrent access;
  • Reads are efficient and non-blocking;
  • Modifications are not efficient (since a brand new copy of the list is taken each time).
Where you need concurrent access and where frequency of reads far outweights frequency of modifications.

Set
Ordering of keysNon-concurrentConcurrent
No particular orderHashSet
SortedTreeSetConcurrentSkipListSet
FixedLinkedHashSetCopyOnWriteArraySet

Map
Ordering of keysNon-concurrentConcurrent
No particular orderHashMapConcurrentHashMap
SortedTreeMapConcurrentSkipListMap
FixedLinkedHashMap

Queue
Blocking?Other criteriaBoundNon-bound
BlockingNoneArrayBlockingQueueLinkedBlockingQueue
Priority-based PriorityBlockingQueue
Delayed DelayQueue
Non-blockingThread-safe ConcurrentLinkedQueue
Non thread-safe LinkedList
Non thread-safe, priority-based PriorityQueue


Reference:
http://www.javamex.com/tutorials/collections/how_to_choose.shtml

Saturday, October 22, 2011

Nested Class or Function Definition?

(Short Notes)

Class composition is of course allowed for object-oriented programming languages. Also, the nested definition for classes is allowed in C++, Java and Python (from my own experience, and other languages may also support this).

But for nested function definition, there are some differences. For C++ and Java, neither allows nested function definition. However, Python does allow nested definition for functions, but such nested functions are only available within the outer function body, unlike the nested class definition which can be accessed from other scopes if defined properly (like defined as public).

Of course, nested function calls are always allowed.

Update:
Function defined in class?
==> member functions

Class defined in function?
==> local classes (link)

In this sense, it's allowed to define member functions of a local class in an enclosing function.

References:
IBM publib

Related Posts:
Class Templates in C++

Wednesday, October 19, 2011

Can Virtual Functions Be Overloaded?

Polymorphism and virtual functions are powerful strategies in C++ for object-oriented programming. Also, function overloading is very convenient if multiple functions work similarly but take different parameters. So, can these two happen along with each other? If yes, how will this work?

In fact, virtual functions, as other functions, CAN be overloaded within the class defining them, in addition to the override option for derived classes. Even for pure virtual functions, they can have multiple signatures in abstract classes! The caveat for overloading pure virtual functions is that every virtual function signature must be implemented in each derived class for instantiation. Otherwise, it will be a compile error.

I wrote a program to illustrate and test virtual functions, including their overriding and overloading. Also, I attached the output below the program (You may need click and enlarge it to see clear output). At the end, I summarized the overloading of virtual functions.

Source code:

Output (click and enlarge to see it clearer):

Summary:
  1. Virtual functions can be overloaded in base or derived classes, like the code lines 6 - 16.
  2. The qualifier "virtual" is not part of function signature, and therefore 2 functions with "virtual" difference only are not allowed by compiler.
  3. Virtual functions can be overloaded by other virtual functions, as well as non-virtual functions, like the code lines 18 - 20. Function vfoo(double,int) is not virtual, which can be demonstrated by the code lines 132 - 137 when i = 4. Only base class version of function vfoo(double,int) was executed.
  4. Pure virtual functions can also be overloaded in the abstract class body, like the code lines 25 - 27.
  5. Pure virtual functions can also be overloaded by regular virtual functions or non-virtual functions, examples not shown in the given code.
  6. Pure virtual functions need to be implemented in derived classes for instantiation. Compiler requires that each pure virtual function signature be implemented, not just the one with signature match.
  7. If a derived class of an abstract class implements the virtual function, its own child classes can access its implementation if the inheritance is public.
  8. If multiple overloaded virtual functions (including pure virtual functions) exist in base class, but only one or some of the function signatures are overridden in derived class, all other un-overridden (including non-virtual) function signatures will be hidden and unavailable in derived class, even the inheritance is public. If uncommenting any lines of 103, 104, 108, or 109, it will be a compile error.
  9. However, if none of such virtual functions are overridden in derived class, all overloaded versions are available in derived class if the inheritance is public (like the code lines 58 - 59).
  10. If calling functions via base class type pointers or references, all base class overloaded functions, if not overridden in derived class, are available, like the code lines 132 - 137.
  11. Parameter types may be automatically casted to the type declared in the function parameter list, like the code lines 102 and 106.
  12. Default arguments work normally for virtual functions, as in regular functions, like the code lines 111 - 114 and 116 - 119.
  13. Ambiguous function calls are not allowed by compiler. Code line 126 will cause a compile error if uncommented.
  14. Default arguments will be ignored if functions are called via base class type pointers or references, like the code line 139.

Monday, October 17, 2011

Convenient Way to Edit Tables in Blogger?

Sometimes I need edit tables in my posts, and found that the HTML coding for tables, especially when I want various features, is tedious. I searched online to see if there are some good ways to do this. Unfortunately, I haven't found any.

As for now, WYSIWYG style is preferred for me to edit tables. So I edit the tables in other editor like Dreamweaver, and then just copy the HTML code back to my blog. I wish there is some better way I can handle tables!

Are there some convenient ways for TABLEs in blog?


================================================================

After I posted above content, I did find that if I can use CSS style well, the table can be edited in a controllable way. Also, the code wasn't be prohibitive, and I can play with it in various ways. Glad that it won't be an obstacle for me to continue with my blog! Hopefully I can learn more along with blogging!

This is a sample table I just edited. Yes, if I really know something, it will be simple.

Header1Header2Header3
Row1, Col1Row1, Col2Row1, Col3
Row2, Col1Row2, Col2Row2, Col3
Row3, Col1Row3, Col2Row3, Col3
Row4, Col1Row4, Col2Row4, Col3
Row5, Col1Row5, Col2Row5, Col3

I think I need some summary, so just post here.
  1. Style can be made for entire table, each row, each column, or each cell.
  2. Style options for columns are limited, and maybe overruled by row styles.
  3. Some options for styles:
    background-color, color, text-align, font-face, font-size, font-weight, text-decorator

Array Declaration and Initialization in C++ and Java

There are several ways to declare and initialize arrays, including the pointers in C++. The syntax for C++ and Java is kind of similar, but different and sometimes confusing. There are no pointers in Java, but the array variables in Java work like pointers in the sense that they both store the addresses and may change. The array names in C++ are constants and fixed at the time of declaration.

For 2- or higher dimensional arrays, the "array constant" way in C++ will give real arrays, and each array will take a contiguous block in memory. Because of this, multiple-dimension arrays can be treated as one-dimension in C++. For the same reason, programmers need know the size of array at the time of declaration. However, for Java arrays, the lengths of each element in a 2-D array can be totally different, where each element is a 1-D array. Hence, in Java, multiple-dimension arrays cannot be treated as one-dimension at all.

Here, I list and compare the ways to declare and initialize arrays in C++ and Java.

Statements C++ Java
int array[5]; Y N
int[5] array; N N
int array[]; N Y
int[] array; N Y
int array[5] = {1,2,3,4,5}; Y N
int array[] = {1,2,3,4,5}; Y Y
int[] array = {1,2,3,4,5}; N Y
int array[5] = new int[5]; N N
int array[] = new int[5]; N Y
int array[] = new int[5]{1,2,3,4,5}; N N
int array[] = new int[]{1,2,3,4,5}; N Y
int* array; Y N
int* array = new int[5]; Y N
int* array = new int[]{1,2,3,4,5}; N N
int* array = new int[5]{1,2,3,4,5}; N N
int* array1, array2; array2 is int N
int[] array1, array2; N array2 is int[]
int []array1, array2; N array2 is int[]
int array1[], array2; N array2 is int
int []array1, []array2; N N
int []array1, array2[]; N array2 is int[][]

Summary:
  1. In C++, array name is a constant; in Java, array name is a variable.
  2. In C++, pointers can be used for an array; in Java, there are no pointers.
  3. In C++, brackets must appear after array name for declaration and access; in Java, brackets can appear before the array name for declaration.
  4. In C++, if the array is not initialized during declaration, the size must be specified; in Java, no constants can exist within brackets when declaration.
  5. In C++, initialization cannot be done using new for an array (not pointer here); in Java, this is possible, like the initialization of a pointer in C++.
  6. In Java, if use new to initialize an array, the size in brackets and the value array cannot appear at the same time; otherwise, Java compiler doesn't know how to decide the size.
  7. In C++, if use new to initialize a pointer array, no customized array can be specified. Instead, all elements of the array will be initialized with the default constructor. If the type doesn't have a default constructor, this is a compile-time error.
  8. In C++, the * after type only specifies the first pointer variable; in Java, the [] after type specifies all array variables in that declaration line.
  9. In Java, if declare more than one variables in a single line, all brackets for the second to last variables must be placed after the variable name, and the brackets before the first variable will apply to all variables in the line.

Related Posts:
new and delete
Object Initialization

Sunday, October 16, 2011

First Pair Summing to Given Number

Problem:
Given an array of integers, find the first pair whose sum equals the given number.

Example:
Given array [14, 8, 5, 7, 4, 1, 20] and number 12, return (5, 7).

Clarification:
The term "first pair" means the both 2 numbers appear before the latter of any other pair. In above example, the first pair is (5, 7) instead of (8, 4).

Thinking:
I discussed the brute force way to solve this problem (post here). The code is simple and it runs in \(O(n^2)\) time. Why can't we use sorting first? Because it will destroy the appearance order? Do we have a way to keep track of the appearance order?

The answer is YES! We can use an auxiliary array to keep track of the appearance during sorting. After sorting, this array can be used to find the first pair we need. Sorting is always a good approach. At the same time, we should notice that this algorithm is \(O(n\log n)\) in time but \(O(n)\) in space, while the brute-force approach only costs \(O(1)\) space. This is a common trade-off.

In the code below, I assume that we don't know the range of the numbers, so we can only rely on the comparison-based sorting algorithms, which means it's unable to achieve \(O(n)\) complexity in time.


Related Posts:
Brute Force Approach to Solve this Problem

First Pair Summing to Given Number

Problem:
Given an array of integers, find the first pair whose sum equals the given number.

Example:
Given array [14, 8, 5, 7, 4, 1, 20] and number 12, return (5, 7).

Clarification:
The term "first pair" means the both 2 numbers appear before the latter of any other pair. In above example, the first pair is (5, 7) instead of (8, 4).

Thinking:
A typical way to find if there are 2 elements in an array such that their sum is a given number is that sort the array first, and then use an auxiliary array and 2 pointers (or something alike, like array indices), scanning the 2 arrays once. The bottleneck of this approach is sorting, and its complexity is \(O(n\log n)\).

But the problem here is to find the first pair, which increases the difficulty. At a glance, sorting may destroy the order of elements in original array, and if doing so, we'll have no idea which pair is the right one. Following this acknowledgement, we may only use a brute-force approach to check each pair by order. Although an auxiliary array can be made use of, but that will also take \(O(n^2)\) in time, with no real improvement.

Although there IS a better way (post here) to do this problem, this post just shows the brute-force code and it's not so ugly.

Related Posts:
\(O(n\log n)\) Approach to Solve this Problem

Binary Search Heavier Ball

Given a line of balls, of which one is heavier and all others are the same.
Also, given a function called balance, which works on an array and tells whether the front part or back part is heavier.

Problem: find the heavier ball.

Thinking: it is natural to consider binary search approach. Yes, and it is also efficient with \(O(\log n)\) complexity in time. But there are some subtle problems with the function "balance". This function works like a scale and can only tell which side is heavier. If the heavier side has more balls than the other, we don't know whether this is due to the extra ball or due to the heavier ball. Hence, if we want the function "balance" to work well in this problem, extra energy is needed to guarantee that the balls on each side are equal in terms of number.

Assume the prototype of balance function is
int balance(int A[ ], int l, int r, int p);
The array A represents the array of balls to be balanced, where l is the left end of the array, r is the right end of the array, and p is the partition point where l, l+1, ..., p will be seen as the front part, and p+1, p+2, ..., r the back part. The return value can be 1, -1, or 0. 1 means front part is heavier, -1 means back part is heavier, and 0 means front part and back part weight the same.

Based on this, we can have following solution (code):

To solve the original problem, we need call searchHeavier(A,0,n-1) (assuming there are n balls).

The code above was a recursive function. It's easy to convert it to an iterative version (not shown here).

More thinking about this problem:
If we don't know for sure there is a ball heavier than all other balls. In other words, there is at most one ball which is heavier and all other balls are the same. In this case, only one change need be made to above code: is the remaining ball really heavier? The modified code is as below:

Function Signature in C++

The Standard defines the signature of a function to include the following at 1.3.10:

the types of its parameters and, if the function is a class member, the cv- qualifiers (if any) on the function itself and the class in which the member function is declared. The signature of a function template specialization includes the types of its template arguments. (14.5.5.1)

It's missing the return type in this definition, which is part of the signature of a function template specialization (i.e a function declaration that declares a function which is a specialization of a template), as pointed out by 14.5.5.1 (recent C++0x working papers fixed that already to mention the return type in 1.3.10 too):

The signature of a function template specialization consists of the signature of the function template and of the actual template arguments (whether explicitly specified or deduced).

The signature of a function template consists of its function signature, its return type and its template parameter list.


The most recent C++0x draft n3000 says about "signature" in 1.3.11, and it is:

the name and the parameter type list (8.3.5) of a function, as well as the class or namespace of which it is a member. If a function or function template is a class member its signature additionally includes the cv-qualifiers (if any) and the ref-qualifier (if any) on the function or function template itself. The signature of a function template additionally includes its return type and its template parameter list. The signature of a function template specialization includes the signature of the template of which it is a specialization and its template arguments (whether explicitly specified or deduced).

Some examples:

int func( void, int );
int func( int );
int func( void )
int func( int, void );
// valid

int func( void );
void func( void );
// invalid

int func( void );
void func( int );
// valid

template<typename T> int func( void );
template<typename T> void func( void );
// valid

Reference:
http://stackoverflow.com/questions/290038/is-the-return-type-part-of-the-function-signature


Related Posts:
Function Template Specialization

Saturday, October 15, 2011

Space Restricted String Hashing

Source: CLRS, Introduction to Algorithms, 2nd Edition, Page 236, Problem 11.3-2

Problem:
Suppose that a string of r characters is hashed into m slots by treating it as a radix-128 number and then using the division method. The number m is easily represented as a 32-bit computer word, but the string of r characters, treated as a radix-128 number, takes many words. How can we apply the division method to compute the hash value of the character string without using more than a constant number of words of storage outside the string itself?

Basics:
This problem requires some basic number theory principles, and more specifically modulo arithmetic. The following 2 will be applied to solve this problem. (The second one holds for integers only)
\begin{equation} (a+b)\%m=(a\%m+b\%m)\%m \end{equation} and \begin{equation} (a \cdot b )\%m=((a\%m) \cdot (b\%m))\%m \end{equation}

Solution:
Based on above analysis, the string of r characters can be hashed using division method with constant space. Any string of length r can be converted to a radix-128 number, say, \[n=c_{0} \cdot 128^{0}+c_{1} \cdot 128^{1}+c_{2} \cdot 128^{2} + \ldots + c_{r-1} \cdot 128^{r-1},\] where \(c_{i}\) means the \(i\)th character starting from the least significant end with 0 (which can also be viewed as an integer of its ASCII code).

Hence, apply the division hashing function, \begin{eqnarray} n\%m &=& (c_{0} \cdot 128^{0}+c_{1} \cdot 128^{1} + \ldots + c_{r-1} \cdot 128^{r-1})\%m \\ &=& ((c_{0} \cdot 128^{0})\%m+(c_{1} \cdot 128^{1})\%m + \ldots + (c_{r-1} \cdot 128^{r-1})\%m)\%m \\ \end{eqnarray} However, it looks that we need first compute some large numbers, for example, \(128^{r-1}\) of the last term. If so, we still need many words to store them. But based on the second rule mentioned above, we don't have to store such large numbers. All we need to do is take the modulo operation after each multiplication. In this way, only constant number of words for storage is needed.

Here is the code.


However, the above code involves nested loops, which is unnecessary. Based on Horner's rule, we can rewrite \(n\) as \[n=c_0+128(c_1+128(c_2+...+128(c_{r-2}+128\cdot c_{r-1}) \ldots))\]
This is also the basis for Rabin-Karp algorithm. Now, the code can be improved as below.

Friday, October 14, 2011

LaTeX Style with Blogger

The other day when I edited mathematical formula with Blogger for the first time, I tried the online latex editor credited by CodeCogs (http://www.codecogs.com/latex/eqneditor.php). It is good for formulae in separate lines from texts and center alignment; however, for inline formulae or equations, they cannot be beautifully aligned with texts.

I explored some more and found that mathjax server is offering such functions I want. All I need to do is add an HTML/Javascript gadget and copy the code for that gadget, or copy the code directly into the <head> section of Template HTML script. The code was found in this forum seed: http://tex.stackexchange.com/questions/13865/how-to-use-latex-on-blogspot. Also, I am attaching it here.

Here I am trying some examples.
This is an inline equation \(3x^2-2x+17=0\) and an inline formula \(\mu=\frac{f}{N}\).
Also, a separate equation is here \[\frac{dy}{dx}=y^2-5y+\sqrt{y}+f(x)\]

They look much better to me. Great gratitude to MathJax (http://www.mathjax.org)!

Direct Addressing on Huge Array

Source: CLRS, Introduction to Algorithms, 2nd Edition, Page 223, Problem 11.1-4

Problem:
We wish to implement a dictionary by using direct addressing on a huge array. At the start, the array entries may contain garbage, and initializing the entire array is impractical because of its size. Describe a scheme for implementing a direct-address dictionary on a huge array. Each stored object should use O(1) space; the operations SEARCH, INSERT, and DELETE should take O(1) time each; and the initialization of the data structure should take O(1) time.

Hint:
Use an additional stack, whose size is the number of keys actually stored in the dictionary, to help determine whether a given entry in the huge array is valid or not.

Solution:
Simply speaking, given an uninitialized huge array, implement a dictionary.
Besides the huge array, one more array can be used to store the actual objects (or their pointers). Call these 2 arrays H and S, respectively. Initially, the arrays contains no records and define a variable n = 0. For a certain key k, if k already exists in the dictionary, we want to keep this relationship: H[k] = i and S[i] = k (or S[i] points to the object with key k). Therefore, H[ S[i] ] = i and S[ H[k] ] = k should both hold for a valid record.

Now we design the dictionary operations:
Initialize: build an empty stack.
Search: given a key k, test 1 ≤ H[k] ≤ n && S[ H[k] ] == k, if true, found and return S[ H[k] ]; otherwise, not found.
Insert: if key k already exists, replace it; otherwise push k into S, say, S[n+1] ← k and n ← n+1, and update the dictionary by H[k] ← n (already incremented).
Delete: delete the object with key k. First search to make sure the record is in the dictionary. If found, after simply deleting the record, there will be a gap in the entry with index H[k] in array S. Need fill in this gap while maintaining the correct relationship between H and S. This involves the following steps:
  • S[ H[k] ] ← S[n]
  • H[ S[n] ] ← H[k]
  • H[k] ← NULL
  • S[n] ← NULL
  • n ← n-1
All these operations can be done in time O(1).

References:
http://ripcrixalis.blog.com/2011/02/08/hello-world/
http://www.cs.duke.edu/courses/summer02/cps130/Homeworks/Solutions/H11-solution.pdf

Filter Text Values in Excel


Sometimes, friends ask me how to extract or get rid of some rows from an Excel spreadsheet. I did something similar before but also need some research to figure out the right way. Of course we can use a programming language and regular expression to do this task, but Excel has already provided such functions. Here I am posting to make a memorandum in case of future use.

In my current version Office 2010 (and I believe it's also the case for Office 2007), Excel provides a way to filter data based on customized criteria. Just follow "Data"->"Filter"->"Advanced". For numerical criteria, users can use relational operations in the criteria cell. This posts focuses on text values instead, and there are many options as listed below.

Notes:
The column names should exist and match for data area and criteria area.
Multiple criteria fields can be placed in adjacent columns.
Criteria in the same row means logical AND, and in different rows means logical OR.
* and ? are wild cards for this use and different from their functions in regular expression.
More combinations are possible based on the rules given below.

="=text"Select cells whose contents are exactly equal to the string "text"
<>textSelect cells whose contents are not equal to the string "text"
textSelect cells whose contents begin with the string "text"
>textSelect cells whose contents are ordered (alphabetically) after the string "text"
*text*Select cells whose contents contain the string "text"
text*textSelect cells whose contents begin with the string "text" AND contain a second occurrance of the string "text"
="=text*text"Select cells whose contents begin with the string "text" AND end with the string "text"
?textSelect cells whose contents begin with any single character, followed by the string "text"
="=text?text"Select cells whose contents begin with the string "text" AND end with the string "text" AND contain exactly one character between these two strings
="=???"Select cells whose contents contain exactly 3 characters

Thursday, October 13, 2011

Horse Race

Question:
Given 25 horses and 5 horses in a race, find the minimum number of races needed to find top 3 horses.

Answer:
7
Step 1: divide 25 horses into 5 groups, each group a race;
Step 2: pick the horses in 3rd place from each group above, then race and suppose get A > B > C > D > E (here '>' means faster);
Step 3: pick the first 3 horses from the group containing A and first 2 horses from group containing B, run a final race.

Below is an illustration.

Reverse a Singly Linked List

Problem: reverse a singly linked list.

Three cases:
  1. Acyclic list
  2. Completely cyclic list (all nodes are in the cycle/loop)
  3. Partially cyclic list (at least one node not in the cycle/loop)

This post discusses Case 3.
Case 3 is some different from Case 1 and Case 2. In some sense, we cannot give an algorithm that will for sure do what you want.

Attempt 1:
Since the first node (starting from head) in the cycle has 2 incoming pointers and 1 outgoing pointer, if we literally reverse every link in the list, this will violate the nature of singly linked list. Actually, there is no sufficient pointer fields to record necessary information.

Attempt 2:
If we define the node whose next points to the first circular node as the tail node of the list, and simply ignore the link from it to the first circular node, we will break the cycle and return a general acyclic list. In addition, more memory will be needed to test if it reaches some node that has appeared before.

Attempt 3:
If simply follow the procedure for reversing an acyclic list as in Case 1, there won't be any error in the running. The problem following is that not all links in the list are reversed, since those links before the loop will be reversed twice, and as a result, unchanged.

Personally, the problem for this case is still open. People may feel Attempt 3 would be the best. But not perfect either!

Related Posts:
Discussion about Case 1
Discussion about Case 2

Structure of Tree Node

Background:
If each node of an arbitrary tree contains 3 pointer fields: left-child, right-sibling, and parent, from any node in the tree, its parent can be reached and identified in constant time and all its children can be reached and identified in time linear in the number of children.

Question:
How to use only 2 pointers and 1 boolean value in each node so that the parent of a node or all of its children can be reached in time linear in the number of children.
(Source: CLRS, Introduction to Algorithms, 2nd Edition, Page 217, Problem 10.4-6)

Answer:
Use 2 pointers left and right, where left always points to its leftmost child or NIL for leaves, and right generally points to right sibling but for rightmost node it points back to the parent. Then the boolean variable should record whether the node is rightmost or not. In this way, each parent node and its children form a loop in the tree (parent's left pointer and all children's right pointer), and they can be reached in time linear in the number of children.

Reverse a Singly Linked List

Problem: reverse a singly linked list.

Three cases:
  1. Acyclic list
  2. Completely cyclic list (all nodes are in the cycle/loop)
  3. Partially cyclic list (at least one node not in the cycle/loop)
This post discusses Case 2.
Requirement: O(n) in time and O(1) in space.

Case 2 is similar to Case 1, except for the test condition of list tail and the corresponding update to original head.

Here is the code for this case.


Related Posts:
Discussion about Case 1
Discussion about Case 3

Wednesday, October 12, 2011

Use 2 Fuses to Measure Time

Given 2 fuses, each can be burned in exactly 1 hour. However, both fuses are not uniform and we have no clue which part burns faster and which slower.

Question: how to measure 45 minutes using these 2 fuses?

Answer:
First measure 30 min, and then measure 15 min.
How to measure 30 min? Light a fuse from both ends at the time.
How to measure 15 min? That's half of 30 min. So light a fuse from both ends after it has burned 30 min.

Final answer is:
1. First light 2 ends of one fuse and 1 end of the other fuse at the same time (time 0 now);
2. At the time one fuse is burned up, light the other end of the remaining fuse (30 min now);
3. When the second fuse is burned up, it's 45 min.

Formula -- CodeCogs

Testing how to insert formulae into posts. The tool used here is online latex equation editor credited by CodeCogs (http://www.codecogs.com/latex/eqneditor.php).

Here is a simple example.

Noticed that there should be some other and more convenient ways to insert formulae into posts. Will learn more in the future.

Acknowledgment to CodeCogs.com!