The final project is a requirement only for graduate students taking CS 651.
The final project can be on any topic you wish in the space of big data. Anything reasonably related to topics that we covered in the course is within scope. For reference, there are three types of projects you might consider:
You may work in groups of up to three, or by yourself if you wish. The amount of effort devoted to the project should be proportional to the number of people on the team. As a guideline, the level of effort should be comparable to two assignments per person.
When you are ready, send me an email describing what you'd like to work on. I will provide you with feedback on appropriateness and scope of your proposed project. The "soft" deadline for this proposal is Nov 11, 2024. There is no penalty if you miss this deadline, but it is in your best interest to not leave this proposal to the last minute.
The deliverable for the final project is a report. Use the ACM Templates, or something similar. That is, there is no specific template you must use, but it should look professional. You can use 1 or 2 columns per page. The contents of the report will vary depending on the type of project you are doing. However, it should certainly describe the goal of your project (what is your learning objective, or what problem are you trying to solve), your methodology, and some kind of evaluation of your results or progress. There are no hard limits on the length of your final report, but you should target something in the range of 5-10 pages.
Your final project will be evaluated according to the following criteria, with roughly equal weight placed on each one.
Your report should clearly indicate where you obtained any data that you used in your project. Include a link to the data if possible.
A note about "Evaluation": If you are implementing an algorithm (or several) in MapReduce/Spark, then evaluation will primarily be about determining the correctness of your implementation. (E.g., you should be testing your results to be sure that your implementation actually works!) If you are learning a new framework, then you should include a section on what you've learned. (You should also implement a simple algorithm to demonstrate what you've learned, and of course the results should be tested!)
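As a minimal sketch of what testing for correctness might look like, the Python below compares a toy map/reduce-style word count against a simple single-machine baseline. Word count, the function names, and the sample input are all illustrative assumptions, not a required approach. With Spark, the same idea would apply: run the job on a small input and compare its collected output with a trusted reference.

```python
from collections import Counter, defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for each whitespace-separated token in a line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_phase(pairs):
    """Sum the values per key, as a reducer would after the shuffle."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def word_count_mapreduce(lines):
    """Toy stand-in for the implementation under test (hypothetical)."""
    return reduce_phase(chain.from_iterable(map_phase(line) for line in lines))


def word_count_reference(lines):
    """Straightforward baseline used as the expected answer."""
    return dict(Counter(word.lower() for line in lines for word in line.split()))


# A small hand-checkable input: the two implementations should agree exactly.
sample = ["big data is big", "Data beats opinion"]
assert word_count_mapreduce(sample) == word_count_reference(sample)
```

Checking against an independent baseline on small inputs (including edge cases such as empty input) is one reasonable way to build confidence before running on the full dataset.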
The deadline for submission of your project report is 11pm on the last day of classes. As you are grad students, I'm allowed to make this a "soft" deadline if you have a reason that you need more time. (This does not use a flex day! I will very likely say yes, so if you need the time, just ask!)