Programming, Algorithm and Technical Tips: December 2011

Thursday, December 8, 2011

Python File Extensions

Now, I am going to work on a new project with Python! Although I know quite a little about Python 2.x, but Python 3 will be the trend and future, and I am going to use it this time. Hopefully will pick it up very soon!

As blog notes, I got some information about Python related file extensions. Please see this site for details about file extensions.

File Extension	File Type	File Description
.py	Developer	Program file or script written in Python Can be created and edited with a text editor, but requires a Python interpreter to run Often used for programming Web servers and other administrative computer systems
.pyc	Executable	Python complied file
.pyd	Developer	Python dynamic module Can be imported into Python programs
.pym	Developer	PYM macro preprocessor file Used by PYM, a macro preprocessor based on the Python programming language Contains macros defined between "#begin python" and "#end python" markers; used to create shortcuts for programmers PYM files store macro shortcuts that are expanded to their full text replacements during the Python interpreter's preprocessing stage
.pyo	Executable	Python optimized code Similar to a .pyc file, but assert statements and SET_LINENO instructions are removed
.pyw	Developer	Python GUI source file Displays a graphical user interface (GUI) May contain standard Python code as well as graphical Python code Must be run using pythonw instead of python in order to prevent a DOS console/terminal from popping up to display the output

References:

FileInfo.com

File Extension PYW

Tuesday, December 6, 2011

Loop & Its Entrance in Singly Linked List

Problem:
Given a singly linked list, check whether there is a loop in it. If yes, find the entrance to the loop.

Thinking:
Use 2 pointers, one fast and one slow. If at some time, these 2 pointers meet, then there is a loop; if fast pointer arrives at the end of the list, there is no loop.

Suppose we have above list structure. Assume loop length is \(r(r\in N)\), and the length before the loop (from \(A\) to \(B\)) is \(l\). Always, one can represent \(l\) as \(l=k\cdot r+p\), where \(k=0,1,2,...\) and \(p=0,1,2,...,r-1\).

Then, suppose the moving speed of "fast" is \(a\) times of "slow", where \(a>1\) and \(a\in N\). If "slow" moves one step a time, "fast" will move \(a\) steps a time.

If there is a loop, where will the 2 pointers meet? Suppose they will meet after "fast" finishes \(n\) loops and "slow" finishes \(m\) loops. And suppose also, at this moment, they are at point C, which is of \(q\) steps after the loop entrance \(B\). Here, \(n\ge1, m\ge0, m,n\in N, q=0,1,...,r\). Hence, the following equation holds: \[k\cdot r+p+n\cdot r+q=a(k\cdot r+p+m\cdot r+q)\] We need find values for \(m,n\) and \(q\) so that above equation holds. The equation can be rewritten as \[n\cdot r-a\cdot m\cdot r-k(a-1)r=(a-1)(p+q)\] We can always find a \(q\) such that \(p+q=r\), then \[n-am-k(a-1)=a-1\] \[n=am+(k+1)(a-1)\] For the first time they meet, \(m=0\) and then \[n=(k+1)(a-1)\] To be most efficient, \(a=2\) and \(n=k+1\).

Conclusion:
From above statement, we know that "fast" and "slow" will definitely meet during the first loop round for "slow". Also, at the moment they meet, they are at \(q=r-p\) steps after \(B\), which is equivalent to \(p\) steps before \(B\). Note that the head length \(l=k\cdot r+p\), which means that if two pointers start at \(A\) and \(C\) at the same time and both move one step a time, they will meet at the loop entrance \(B\) for the first time.

Below, I just show the code to find the loop entrance. Actually, the loop check is just part of and can be told from the return value of this program.

Are 2 Strings Anagrams or Not?

Problem:
Given 2 strings, determine if they are anagrams. Anagram means the 2 strings contain exactly the same characters and number of appearances. In other words, we can do a permutation to get the other string from one string.

Solution 1:
Sort the 2 strings first, then if they are anagrams, the sorted strings must be identical. Otherwise, they are not.

In C++, we may use the library function sort in the header < algorithm >. This function is comparison based and the average time complexity is \(O(n\log n)\).

Solution 2:
Since the character values are always in a fixed range, and in this case, counting sort is good approach. In this way, we may obtain \(O(n)\) time complexity and \(O(n)\) space complexity.

Solution 3:
Similar to the idea of counting sort, we don't actually sort the strings. All we need is the counts of characters appearing in the strings. If the counts for each character are equal, then the 2 strings are anagram. This solution also exhibit \(O(n)\) time complexity, but \(O(1)\) space complexity.

Solution 4:
Even more, we can just count the characters for one string, and make judgement while scanning the other. This approach also has \(O(n)\) time complexity and \(O(1)\) space complexity.
There is a bug in Solution 4. What if s1 is longer and contains all characters in s2? For example, s1="abcd", s2="ab".

Two possible solutions for this:

Add a length check at the beginning. Two strings must be of the same length to be anagram.
Add another round of check on the count array after scanning the second string. All entries should be 0.