CS 514, Algorithms, Fall 2022 HW4 - Priority Queue and Heaps Due via the submit program on Monday Oct 17, 11:59pm. No late submission will be accepted. Need to submit: report.txt, nbest.py, kmergesort.py, datastream.py. datastream.py will be graded for correctness (1%). To submit: flip $ /nfs/farm/classes/eecs/fall2022/cs514-001/submit hw4 report.txt {nbest,kmergesort,datastream}.py (You can submit each file separately, or submit them together.) To see your best results so far: flip $ /nfs/farm/classes/eecs/fall2022/cs514-001/query hw4 Textbooks for References: [1] CLRS Ch. 6 [2] KT slides for binary heaps (only read the first 20 pages!): https://www.cs.princeton.edu/~wayne/kleinberg-tardos/pdf/BinomialHeaps.pdf [3] Python heapq module 0. There are two methods for building a heap from an unsorted array: (1) insert each element into the heap --- O(nlogn) -- heapq.heappush() (2) heapify (top-down) --- O(n) -- heapq.heapify() (a) Derive these time complexities. (b) Use a long list of random numbers to show the difference in time. (Hint: random.shuffle or random.sample) (c) What about sorted or reversely-sorted numbers? 1. Given two lists A and B, each with n integers, return a sorted list C that contains the smallest n elements from AxB: AxB = { (x, y) | x in A, y in B } i.e., AxB is the Cartesian Product of A and B. ordering: (x,y) < (x',y') iff. x+y < x'+y' or (x+y==x'+y' and y>> a, b = [4, 1, 5, 3], [2, 6, 3, 4] >>> nbesta(a, b) # algorithm (a), slowest [(1, 2), (1, 3), (3, 2), (1, 4)] >>> nbestb(a, b) # algorithm (b), slow [(1, 2), (1, 3), (3, 2), (1, 4)] >>> nbestc(a, b) # algorithm (c), fast [(1, 2), (1, 3), (3, 2), (1, 4)] Filename: nbest.py 2. k-way mergesort (the classical mergesort is a special case where k=2). >>> kmergesort([4,1,5,2,6,3,7,0], 3) # k=3 [0,1,2,3,4,5,6,7] Q: What is the complexity? Write down the detailed analysis in report.txt. Filename: kmergesort.py 3. [WILL BE GRADED] Find the k smallest numbers in a data stream of length n (k<>> ksmallest(4, [10, 2, 9, 3, 7, 8, 11, 5, 7]) [2, 3, 5, 7] >>> ksmallest(3, range(1000000, 0, -1)) [1, 2, 3] Note: a) it should work with both lists and lazy lists b) the output list should be sorted Q: What is your complexity? Write down the detailed analysis in report.txt. Filename: datastream.py [UPDATE] The built-in function heapq.nsmallest() is _not_ allowed for this problem. The whole point is to implement it yourself. :) 4. (optional) Summarize the time complexities of the basic operations (push, pop-min, peak, heapify) for these implementations of priority queue: (a) unsorted array (b) sorted array (highest priority first) (c) reversly sorted array (lowest priority first) (d) linked list (e) binary heap 5. (optional, but will be covered in the Midterm!) Given k sorted lists, each of length n, output the n smallest numbers of all these kn numbers. Real world scenario: The United States has k=50 states. Each state has chosen a state tennis team of n=5 members. (these are the best players in each state). Now we need to select Team USA of n=5 members for Olympics, out of these 50x5 candidates. Note: most states would not have any one on Team USA! >> select([1, 5, 7, 9], [2, 4, 8, 10], [0, 3, 6, 9]) [0, 1, 2, 3] >> select([16, 18], [0, 10], [5, 7], [2, 9], [11, 19], [8, 17], [6, 13], [1, 11], [14, 16], [1, 4]) [0, 1] Note that select() takes arbitrary number of arguments. You should write: def select(*a): and then you can treat the variable "a" as a list of lists, e.g., k = len(a) or print(a[0][0]). Filename: team.py Q: what's the time complexity? Are you sure it's the optimal? (Think about how to improve it a little bit, since most states have no representation on Team USA!) Debriefing (required!): -------------------------- 0. What's your name? 1. Approximately how many hours did you spend on this assignment? 2. Would you rate it as easy, moderate, or difficult? 3. Did you work on it mostly alone, or mostly with other people? Note you are encouraged to discuss with your classmates, but each students should submit his/her own code. 4. How deeply do you feel you understand the material it covers (0%-100%)? 5. Which part(s) of the course you like the most so far? 6. Which part(s) of the course you dislike the most so far? This section is intended to help us calibrate the homework assignments. Your answers to this section will *not* affect your grade; however, skipping it will certainly do.