Programming Question 2 | CS 240E

In this programming question, you will be implementing two data structures, and analyzing their performance (in terms of the number of block transfers to/from storage).

You will write two programs that model your friend list. As you are a computer scientist, you have reduced your friends to numerical abstractions, or to integer key-value pairs, the value being the utility that friend brings to you. (We’re too lazy to model friends by their names).

Only make friends if they can do something for you – R. Evans, CS 247

~~(or is this only for software engineers)?~~

You’re lucky: you’re so personable that you don’t lose friends. (Also, I’d get hate mail if I made you implement B-tree deletion). Therefore, the only operations your program needs to support are:

insert K V; inserts a friend with name K and utility V into your friendabase. After running this command your program prints out the block that friend is stored in.
search K; returns and prints the utility of friend K (when you need to buy Christmas presents). After running this command you will print out the value of that friend.

I have provided two starter files, block.h and block.cpp, which form the interface to the storage which backs your friendabase. You’re so popular that I have given you access to 2^28 integers of storage to hold your numerous friends, organized into blocks of size 2^12 integers! Unfortunately, I am in need of money, and have therefore sold out CS 240E, sponsored by Robbers Communications, owner of the Toronto Failing Leafs. Therefore all block transfes will be on the Robbers(TM) 5G network and hence you will therefore be paying 1 BTC per block transferred.

Your programs will run in a loop receiving operations from standard input, one input on a line. When you run out of operations (when reading from standard input produces EOF) you will stop your program and print out the number of BTC you owe Robbers.

Since you are an enterprising computer scientist (or cause I said so), you will be implementing two variants of your friendabase:

An BST backed implementation. You are free to efficiently (or inefficiently) use blocks for storing binary search tree nodes. (Yon only need to submit a normal BST now)
A B-tree backed implementation, where a full B-tree node should fit in one block. (For a 4096-int-sized-block, well, an item has 2 ints and is separated by ints for pointers).

After you have implemented both programs, well, I hope you made some test cases. Run your programs on representative test cases of size 2^{5…24} many friend insertions and 2^{5…24} friend lookups, and graph the amount of BTC you owe for each program as a function of the number of total operations. Describe in a short writeup (at-most-1-short-paragraph) why your programs behave they way they do.

Submit your test cases named {n-insert-m-queries.in}, your output named {n-insert-m-queries-bst.out}{n-insert-m-queries-b.out}, the starter files and the source of both programs {bst-friend.cpp, b-friend.cpp} to Marmoset. Submit your graphs and writeup to Crowdmark.

Input format:
The input file contains many lines, each line is either:

insert K V, where K is the key (friend name) and V is the value (utility)
search K, where K is the key (friend name) to search for

Output format:
Contains many lines for each insert (the block number where the friend is stored) and search (the utility of the friend), with the last line being the total number of block transfers. .out test cases format:
The .out file should contain one number, the total BTC owed for the operations performed.

Stdin / .in test cases example:

insert 1 10
insert 2 20
search 1
search 2
insert 3 30

Stdout example:

.out test cases example:

What should the program print?
utility of friend K for each query and the total number of transfers.

What should the .out test cases I submit contain?
Only the number of transfers.

Bonus marks:

Implement AVL rotations for part (a), or implement any other kinda of balanced binary tree, there will be a leader board sorted by # of block accesses. The #1 #2 place will get 25% bonus mark, #3 #4 #5 will get 20%, #6-#10 will get 15%, and rank > #10 will get 10%. The tree you submitted must be a balanced binary tree for bonus mark. Submit avl-friend.cpp.

Implement B+ tree. it worths 10% if your output to our test cases correct. If you submit both bonus, then the bonus mark for B+ tree will be 5%. Submit bplus-friend.cpp.

For both bonus marks, you don't need to make the test cases and writeup, but you need to make sure your program works correctly. They will each have a seperated marmoset project. You can do both and the bonus marks will sum up, except the B+ tree bonus mark will be 5% instead of 10%.
You can check more details and Q&A on Piazza.