CS 325, Algorithms, Spring 2023
HW4 - Priority Queue and Heaps
Due via the submit program on Monday May 1, 11:59pm.
No late submission will be accepted.
Need to submit: report.txt, nbest.py, datastream.py, team.py.
datastream.py will be graded for correctness (1%).
To submit:
flip $ /nfs/farm/classes/eecs/spring2023/cs325-001/submit hw4 report.txt {nbest,datastream,team}.py
(You can submit each file separately, or submit them together.)
To see your best results so far:
flip $ /nfs/farm/classes/eecs/spring2023/cs325-001/query hw4
Textbooks for References:
[1] CLRS Ch. 6
[2] KT slides for binary heaps (only read the first 20 pages!):
https://www.cs.princeton.edu/~wayne/kleinberg-tardos/pdf/BinomialHeaps.pdf
[3] Python heapq module
0. [DIDN'T COVER IN CLASS; BUT YOU CAN READ TEXTBOOKS; WE WILL GO OVER THEM NEXT WEEK]
There are two methods for building a heap from an unsorted array:
(1) insert each element into the heap --- O(nlogn) -- heapq.heappush()
(2) heapify (top-down) --- O(n) -- heapq.heapify()
Derive these time complexities.
1. Given two lists A and B, each with n integers, return
a sorted list C that contains the smallest n elements from AxB:
AxB = { (x, y) | x in A, y in B }
i.e., AxB is the Cartesian Product of A and B.
ordering: (x,y) < (x',y') iff. x+y < x'+y' or (x+y==x'+y' and y>> a, b = [4, 1, 5, 3], [2, 6, 3, 4]
>>> nbesta(a, b) # algorithm (a), slowest
[(1, 2), (1, 3), (3, 2), (1, 4)]
>>> nbestb(a, b) # algorithm (b), slow
[(1, 2), (1, 3), (3, 2), (1, 4)]
>>> nbestc(a, b) # algorithm (c), fast
[(1, 2), (1, 3), (3, 2), (1, 4)]
Filename: nbest.py
2. [WILL BE GRADED]
Find the k smallest numbers in a data stream of length n (k<>> ksmallest(4, [10, 2, 9, 3, 7, 8, 11, 5, 7])
[2, 3, 5, 7]
>>> ksmallest(3, range(1000000, 0, -1))
[1, 2, 3]
[UPDATED -- lazy list (generator) input]
>>> ksmallest(5, (x**2 for x in range(10,0,-1)))
[1, 4, 9, 16, 25]
>>> import random; random.seed(1); ksmallest(5, (random.randint(0,100) for _ in range(10)))
[8, 15, 17, 32, 57]
Note:
a) it should work with both lists and lazy lists.
[UPDATE]
for both types of lists, you can always do:
for x in a:
...
but you can't do a[0] or a[-1] or len(a) on a lazy list.
A lazy list is most often a "generator", which can be created in two simple methods:
(1) yield statement (see HW3 xyz solutions for an example), or
(2) a "generator expression" (genexp), which is the lazy version of list comprehension:
a = [x**2 for x in range(10)] # a (non-lazy) list
b = (x**2 for x in range(10)) # a generator (lazy list)
the only difference is [...] vs (...)
for more details, see:
https://stackoverflow.com/questions/2776829/difference-between-pythons-generators-and-iterators
b) the output list should be sorted
Q: What is your complexity? Write down the detailed analysis in report.txt.
Filename: datastream.py
Note: The built-in function heapq.nsmallest() is _not_ allowed for this problem.
The whole point is to implement it yourself. :)
3. [FROM QUIZ 2 -- no need to base it on our skeleton in the quiz]
Given k sorted lists, each of length n, output the n smallest numbers of all these kn numbers.
Real world scenario: USA has k=50 states. Each state has chosen a state team of n=5 members.
Now we need to select Team USA of n=5 members for Olympics, out of these 50x5 candidates.
>> select([1, 5, 7, 9], [2, 4, 8, 10], [0, 3, 6, 9])
[0, 1, 2, 3]
>> select([16, 18], [0, 10], [5, 7], [2, 9], [11, 19], [8, 17], [6, 13], [1, 11], [14, 16], [1, 4])
[0, 1]
Note that select() takes arbitrary number of arguments. You should write:
def select(*a):
...
and then you can treat the variable "a" as a list of lists, e.g., k = len(a) or print(a[0][0]).
Filename: team.py
Q: What's the time complexity? Are you sure it's the optimal?
Hint: you can improve it a little bit, since most states don't have any one on Team USA!
No need to implement this improvement though.
Debriefing (required!): --------------------------
0. What's your name?
1. Approximately how many hours did you spend on this assignment?
2. Would you rate it as easy, moderate, or difficult?
3. Did you work on it mostly alone, or mostly with other people?
Note you are encouraged to discuss with your classmates,
but each students should submit his/her own code.
4. How deeply do you feel you understand the material it covers (0%-100%)?
5. Which part(s) of the course you like the most so far?
6. Which part(s) of the course you dislike the most so far?
This section is intended to help us calibrate the homework assignments.
Your answers to this section will *not* affect your grade; however, skipping it
will certainly do.