CS 514, Algorithms, Fall 2022
HW9 - Graphs (part 2), DP (part 4)

Due Monday Nov 21, 11:59pm. 
No late submission will be accepted.

Include in your submission: report.txt, dijkstra.py, nbest.py.
dijkstra.py will be graded for correctness (1%).

Textbooks for References:
[1] CLRS Ch. 22 (graph)
[2] my DP tutorial (up to page 16):
    http://web.engr.oregonstate.edu/~huanlian/slides/COLING-tutorial-anim.pdf
[3] DPV Ch. 3, 4.2, 4.4, 4.7, 6 (Dasgupta, Papadimitriou, Vazirani)
    https://www.cs.berkeley.edu/~vazirani/algorithms/chap3.pdf
    https://www.cs.berkeley.edu/~vazirani/algorithms/chap4.pdf
    https://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf
[4] KT Ch. 6 (DP)
    http://www.aw-bc.com/info/kleinberg/assets/downloads/ch6.pdf
[5] KT slides: Greedy II (Dijkstra)
    http://www.cs.princeton.edu/~wayne/kleinberg-tardos/

***Please answer time/space complexities for each problem in report.txt.

1. [WILL BE GRADED]
   Dijkstra (see CLRS 24.3 and DPV 4.4)
   
   Given an undirected graph, find the shortest path from source (node 0)
   to target (node n-1). 
   
   Edge weights are guaranteed to be non-negative, since Dijkstra doesn't work
   with negative weights, e.g.
 
       3
   0 ------ 1   
     \    /
    2 \  / -2
       \/
       2
   
   in this example, Dijkstra would return length 2 (path 0-2), 
   but path 0-1-2 is better (length 1).

   For example (return a pair of shortest-distance and shortest-path):
   
       1
   0 ------ 1   
     \    /  \
    5 \  /1   \6
       \/   2  \
       2 ------ 3
            
   >>> shortest(4, [(0,1,1), (0,2,5), (1,2,1), (2,3,2), (1,3,6)])
   (4, [0,1,2,3])

   If the target node (n-1) is unreachable from the source (0),
   return None:

   >>> shortest(5, [(0,1,1), (0,2,5), (1,2,1), (2,3,2), (1,3,6)])
   None

   Another example:

      1          1
   0-----1    2-----3

   >>> shortest(4, [(0,1,1), (2,3,1)])
   None

   Tiebreaking: arbitrary. Any shortest path would do.

   Filename: dijkstra.py

   Hint: you need a priority queue implementation that supports the classical "Decrease-Key" operation in textbooks. Unfortunately, most implementations (Python's heapq and C++/Java's) do not support it, because you need to combine a hash table (dict) and a priority queue, which is non-trivial. Luckily, there are several very good implemenations in Python:

   [1] heapdict:
   https://raw.githubusercontent.com/DanielStutzbach/heapdict/master/heapdict.py

   >>> from heapdict import heapdict
   >>> h = heapdict()
   >>> h['a'] = 3
   >>> h['b'] = 1
   >>> h.peekitem()
   ('b', 1)
   >>> h['a'] = 0
   >>> h.peekitem()
   ('a', 0)
   >>> h.popitem()
   ('a', 0)
   >>> len(h)
   1
   >>> 'a' in h
   False
   >>> 'b' in h 
   True

   [2] pqdict:
   https://github.com/nvictus/priority-queue-dictionary

   You can also use:
   https://classes.engr.oregonstate.edu/eecs/fall2022/cs514-001/hw9/pqdict.py

   from pqdict import pqdict

   [3] priority_dict:
   https://gist.github.com/matteodellamico/4451520#file-priority_dict-py

   Note: this is depecrated (Python2), and I've converted it to Python3 and made a few other changes to make the interface identical to heapdict and pqdict; please use:

   https://classes.engr.oregonstate.edu/eecs/fall2022/cs514-001/hw9/priority_dict.py

   from priority_dict import priority_dict


   They all share the exact same interface; you can use any one in your code (all these tools are installed in the grader).

   Speed: priority_dict < pqdict << heapdict ("<" means faster)   

   Q: What if you only have heapq, can you still make Dijkstra work (with the built-in heapq)? 
      Can you re-analyze the time/space complexities?

   Q: Is Dijkstra a greedy algorithm or dynamic programming algorithm?
      Most textbooks (such as KT and CLRS) classify it as a greedy algorithm,
      but some people have different opinions, e.g.:

      (a) https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm#Dynamic_programming_perspective
      (b) Dijkstra's algorithm revisited: the dynamic programming connexion, by M. Sniedovich (2006):
      http://matwbn.icm.edu.pl/ksiazki/cc/cc35/cc3536.pdf
      (c) https://www.quora.com/Is-Dijkstras-Algorithm-a-greedy-algorithm-or-a-dynamic-programming-algorithm
      (d) https://aclanthology.org/C08-5001.pdf
      
      What do you think?

   Q: for problems that can be solved by both Dijkstra and Viterbi, which one is faster? 


2. Traveling Salesman Problem (TSP, CLRS 34.5.4, KT 8.5).

   See also:
   https://en.wikipedia.org/wiki/Held%E2%80%93Karp_algorithm
   http://www.math.nagoya-u.ac.jp/~richard/teaching/s2020/Quang1.pdf   
   

   Given an undirected graph of n nodes (0..n-1) representing a road network,
   the traveling salesman has to start from city 0 and visit each city
   once and only once, and return to city 0. Find the minimum-length tour (cycle)
   that satisifies these conditions (this is also called "Hamiltonian Cycle").

   Write the subproblem definition, recurrence relation, and space/time complexities in report.txt.

   Input: same as Dijkstra
   Output: (cycle_length, cycle_list)
   Tiebreaking: arbitrary

   E.g., for the above example in Dijkstra, one possible best cycle is 0-1-3-2-0, with a cost of 14.

       1
   0 ------ 1   
     \    /  \
    5 \  /1   \6
       \/   2  \
       2 ------ 3

   >>> tsp(4, [(0,1,1), (0,2,5), (1,2,1), (2,3,2), (1,3,6)])
   (14, [0,1,3,2,0])

   If we add an edge (3,0,1), then the best cycle cost reduces to 5:

   >>> tsp(4, [(0,1,1), (0,2,5), (1,2,1), (2,3,2), (1,3,6), (3,0,1)])
   (5, [0,1,2,3,0])

   
   Hints:

   1. The subproblem should be like best[S, v] where S is the subset of nodes visited and v is the last node visited in S. This way you can keep track of which nodes have been visited and make sure not to revisit any node:

   ({0}, 0) --> ({0,1}, 1) --> ({0,1,3}, 3) --> ({0,1,2,3}, 2) --> ...

   2. You might find it convenient to add a dummy sink node n (which is a clone of node 0) so the TSP path would end in node n.
   
   3. This DP solution essentially constructs another (much bigger) graph G' based on the input graph G, where each node in G' looks like (S, v). TSP can now be viewed as shortest-path in G' from ({0}, 0) to ({0..n}, n).

   4. This problem can be solved by either Viterbi (simpler) or Dijkstra. The classical Bellman-Held-Karp TSP algorithm (see Wikipedia) is an instance of the former.

   5. To implement the set S, you can use Python's frozenset (because set is not hashable), or you can use bit operations (recommended). Note that Python integers are automatically high-precision, so any n is fine (unlike C/Java where integers have restricted ranges), e.g.:

   >>> 1<<100 # 2^100
   1267650600228229401496703205376


   Additional real-world examples: # from a map of germany: https://stackoverflow.com/questions/11007355/data-for-simple-tsp

   >>> tsp(11, [(0,1,29),(0,2,20),(0,3,21),(0,4,16),(0,5,31),(0,6,100),(0,7,12),(0,8,4),(0,9,31),(0,10,18),
                (1,2,15),(1,3,29),(1,4,28),(1,5,40),(1,6,72),(1,7,21),(1,8,29),(1,9,41),(1,10,12),
                (2,3,15),(2,4,14),(2,5,25),(2,6,81),(2,7,9),(2,8,23),(2,9,27),(2,10,13),
                (3,4,4),(3,5,12),(3,6,92),(3,7,12),(3,8,25),(3,9,13),(3,10,25),
                (4,5,16),(4,6,94),(4,7,9),(4,8,20),(4,9,16),(4,10,22),
                (5,6,95),(5,7,24),(5,8,36),(5,9,3),(5,10,37),
                (6,7,90),(6,8,101),(6,9,99),(6,10,84),
                (7,8,15),(7,9,25),(7,10,13),
                (8,9,35),(8,10,18),
                (9,10,38)])

   (253, [0, 8, 10, 1, 6, 2, 5, 9, 3, 4, 7, 0])

   (Viterbi: 0.02s; Dijkstra: 0.02s)


   Random examples:

   >>> tsp(16, [(1, 2, 0), (11, 5, 5), (9, 8, 4), (6, 1, 4), (5, 13, 5), (12, 11, 4), (14, 8, 0), (0, 11, 3), (10, 12, 3), (5, 5, 1), (7, 0, 1), (10, 5, 1), (11, 5, 3), (13, 11, 4), (11, 11, 3), (5, 12, 5), (14, 7, 3), (8, 15, 4), (11, 14, 3), (11, 14, 3), (7, 10, 5), (5, 8, 3), (9, 9, 5), (13, 9, 5), (6, 15, 4), (11, 2, 2), (0, 6, 5), (3, 1, 4), (1, 8, 4), (7, 3, 4), (4, 8, 1), (6, 1, 3), (1, 1, 2), (11, 5, 1), (0, 2, 0), (2, 0, 0), (0, 11, 2), (4, 5, 5), (5, 0, 3), (1, 7, 1), (1, 0, 2), (3, 9, 2), (15, 0, 2), (14, 1, 2), (12, 4, 3), (7, 2, 5), (10, 3, 0), (14, 4, 4), (12, 15, 4), (10, 4, 2), (8, 8, 4), (13, 0, 5), (4, 1, 2), (1, 4, 1), (5, 3, 3), (7, 1, 1), (7, 14, 0), (8, 2, 4), (7, 11, 2), (13, 8, 4), (0, 4, 0), (12, 13, 1), (3, 2, 1), (3, 3, 0), (5, 7, 0), (6, 0, 4), (14, 14, 2), (12, 6, 5), (6, 13, 3), (0, 1, 3), (5, 3, 5), (15, 11, 0), (3, 11, 2), (11, 9, 0), (13, 3, 0), (9, 6, 5), (0, 14, 0), (13, 15, 3), (6, 2, 0), (9, 0, 2), (9, 2, 1), (15, 6, 0), (11, 12, 5), (14, 4, 2), (12, 3, 2), (3, 3, 0), (10, 12, 1), (3, 0, 4), (15, 1, 5), (15, 9, 2), (14, 4, 2), (8, 15, 4), (15, 13, 3), (9, 12, 1), (5, 15, 4), (8, 13, 5), (2, 3, 0), (11, 5, 4), (4, 13, 0), (2, 1, 1)])
   
   (6, [0, 4, 8, 14, 7, 5, 10, 3, 13, 12, 9, 11, 15, 6, 2, 1, 0])

   (Viterbi: 1.0s, Dijkstra: 0.1s)

   Q: space and time complexity?

   Filename: tsp.py


Debriefing (required!): --------------------------

0. What's your name?
1. Approximately how many hours did you spend on this assignment?
2. Would you rate it as easy, moderate, or difficult?
3. Did you work on it mostly alone, or mostly with other people?
4. How deeply do you feel you understand the material it covers (0%-100%)? 
5. Any other comments?

This section is intended to help us calibrate the homework assignments. 
Your answers to this section will *not* affect your grade; however, skipping it
will certainly do.