Students taking this course for 3 credits are required to design, carry out, and communicate the results of a small research project in information security. Projects are carried out in teams of two people. There are four components
to your project grade: a project proposal, a midterm status report, a final project report, and a project presentation.
In Fall 2005, students produced the following term projects:
Form groups (Due Sep 21, end of class)
Find a project partner and begin to discuss project ideas.
Project proposal (Due Oct 12, 10pm)
Your proposal should explicitly state the problem your project will address,
your project’s goal and motivation, related work, the methodology and plan for your project, and
the resources needed to carry out your project. Be sure to structure your plan as a set of incremental
milestones, and include a schedule for meeting them.
Midterm status report (Due Nov 8, 10pm)
Your status report should contain enough implementation, data, and analysis
to show that your project is on the right track. You should include your original proposal with
instructor comments, along with any surprising results or changes in direction, schedule, etc. You
should also have a refined version of the problem statement and goals, as well as a more developed
related work section.
Final project report and presentation (Presentations due Dec 12, Papers due Dec 14)
A final report describing your research problem, your contributions, and your analysis is required. You should also present your research problem, analysis,
and results to the class in a brief presentation. The presentation may
include a system demo if appropriate. The final report should include a paragraph explaining, for
each group member, their contributions and duties in the project.
Here are some project ideas. You may propose modifications to the projects below, or new project ideas, with the approval of the instructor. (If you wish to do this, you should consult with the instructor before writing up the project proposal.)
Investigate creative and non-intrusive methods for preventing unauthorized access to information on an RFID tag. Experiment with RFID hardware to duplicate tags. Build an RFID repeater or develop methods to effectively extend the read range of RFID tags. Develop an adversarial model for RFID security.
- Privacy v. Utility in k-anonymized data
Sweeney has proposed k-anonymization as a technique for providing privacy for published data. Anonymized data must also be useful to its recipients. Therefore privacy and utility are competing goals. The goal of this project is to investigate the trade-off between privacy and utility for real datasets, and to propose and analyze a novel measure of privacy and a novel measure of utility. You will begin by working with a real dataset, anonymizing it according to your own algorithm (or an algorithm proposed in the literature) and looking at the results. How can you characterize the privacy of the anonymized data? What k is reasonable? Does the parameter k serve as an accurate measure of privacy? How can you characterize the utility of the anonymized data?
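As a starting point, the sketch below computes the largest k for which a small table is k-anonymous, before and after a simple generalization step. The toy rows, the choice of quasi-identifiers, and the helper names (`k_of`, `generalize_age`, `generalize_zip`) are all assumptions made for illustration. They are not taken from Sweeney's work or from any particular dataset.

```python
from collections import Counter


def k_of(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    quasi-identifier values, i.e. the largest k for which the table is
    k-anonymous. Returns 0 for an empty table."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0


def generalize_age(row, width=10):
    """Replace an exact age with a range such as '30-39' (hypothetical rule)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


def generalize_zip(row, keep=3):
    """Keep only the first `keep` digits of the ZIP code (hypothetical rule)."""
    z = row["zip"]
    return {**row, "zip": z[:keep] + "*" * (len(z) - keep)}


# Made-up records; "age" and "zip" are treated as the quasi-identifiers.
rows = [
    {"age": 34, "zip": "01002", "diagnosis": "flu"},
    {"age": 36, "zip": "01003", "diagnosis": "asthma"},
    {"age": 52, "zip": "01060", "diagnosis": "flu"},
    {"age": 57, "zip": "01063", "diagnosis": "diabetes"},
]
qi = ["age", "zip"]

raw_k = k_of(rows, qi)
anonymized = [generalize_zip(generalize_age(r)) for r in rows]
anon_k = k_of(anonymized, qi)
print("k before:", raw_k, "k after:", anon_k)
```

A measurement like this covers only the privacy side. A utility measure (for example, how much a query answer changes after generalization) would have to be defined and computed separately.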
Hash functions and digital signatures are useful for building secure software distribution. Done correctly, an update mechanism can prevent an adversary from tricking a user into installing malicious software. But how secure are the update mechanisms for virus definition updates, Microsoft updates, Apple updates, Firefox updates, Debian updates, and other software update systems? Dump the network traffic to discover how these public or proprietary systems work. Analyze the weaknesses. Experiment with an SSL man-in-the-middle attack against your own machine. Can you trick your machine into installing fake content or cause an error condition?
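To make the role of the hash function concrete, here is a minimal sketch of the check a client might run on a downloaded update. It assumes the expected digest reached the client over an authenticated channel, for example inside a signed manifest. Signature verification itself is not shown, and the function names and payloads are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the payload."""
    return hashlib.sha256(data).hexdigest()


def verify_update(payload: bytes, expected_digest: str) -> bool:
    """Accept the payload only if its digest matches the published one.
    compare_digest performs a constant-time comparison."""
    return hmac.compare_digest(sha256_hex(payload), expected_digest.lower())


# Made-up payloads standing in for a real update file.
genuine = b"update v1.2 contents"
tampered = b"update v1.2 contents + malware"

# Assumption: this digest was obtained through an authenticated channel.
published = sha256_hex(genuine)

ok = verify_update(genuine, published)
bad = verify_update(tampered, published)
print("genuine accepted:", ok, "tampered accepted:", bad)
```

If an adversary can replace both the payload and the digest, this check proves nothing. How a given update system authenticates the digest is one of the things the project asks you to find out.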
- Authenticated aggregate queries
Devanbu et al. describe techniques for authenticating simple database queries using hash trees. Design a hash tree that can authenticate aggregate queries like SUM, AVG, MIN, MAX over numerical attributes. For example, assume tuples consist of Employee(name, age, salary) fields. The goal is to design a hash tree that permits clients to verify that answers to queries like 'SELECT MAX(salary) FROM Employee' are correct. Evaluate the efficiency of your design analytically and experimentally for queries as well as updates to the database. Does your design require revealing extra information to the client that would not normally be revealed in the evaluation of an aggregate query?
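One possible starting point, sketched below, is a hash tree in which every node also stores the maximum salary of its subtree and includes that value in its digest. This is an illustrative assumption, not the construction of Devanbu et al. The names `Node`, `leaf`, `combine`, and `build` are hypothetical, and the sketch assumes the data owner signs the root and gives it to clients.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    digest: bytes      # hash committing to the whole subtree
    max_salary: int    # aggregate annotation for the subtree


def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts, so field boundaries are unambiguous."""
    hasher = hashlib.sha256()
    for p in parts:
        hasher.update(len(p).to_bytes(4, "big"))
        hasher.update(p)
    return hasher.digest()


def leaf(name: str, age: int, salary: int) -> Node:
    """Leaf for one Employee(name, age, salary) tuple."""
    digest = h(b"leaf", name.encode(), str(age).encode(), str(salary).encode())
    return Node(digest, salary)


def combine(left: Node, right: Node) -> Node:
    """Internal node: the digest covers both children and the subtree maximum."""
    m = max(left.max_salary, right.max_salary)
    return Node(h(b"node", left.digest, right.digest, str(m).encode()), m)


def build(nodes) -> Node:
    """Build the tree bottom-up; an unpaired node is promoted unchanged."""
    level = list(nodes)
    if not level:
        raise ValueError("cannot build a tree over an empty table")
    while len(level) > 1:
        nxt = [combine(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Made-up table contents.
employees = [("alice", 41, 90000), ("bob", 35, 72000), ("carol", 29, 110000)]
root = build(leaf(*e) for e in employees)

# Changing any salary changes the root digest, so a signed root pins the answer.
forged = [("alice", 41, 90000), ("bob", 35, 72000), ("carol", 29, 120000)]
forged_root = build(leaf(*e) for e in forged)
print(root.max_salary, root.digest != forged_root.digest)
```

With this layout the answer to MAX over the whole table can be read from the signed root. Queries restricted by a predicate, and SUM or AVG, need more care, and that is where the design and evaluation work of the project lies.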
- Performance of cryptography
Ten years ago it was difficult to run symmetric block ciphers fast. Now block ciphers and hash functions can exceed network throughputs. Public key encryption is no longer prohibitively expensive. Analyze past and new ciphers (e.g., block ciphers, traditional public key systems, elliptic curve crypto) to catalog their performance on various systems. AMD and Intel machines often produce dramatically different results. Figure out what causes the discrepancies and make predictions for the future performance of cryptography.
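A measurement harness for this project could begin with something like the sketch below, which times a few hash functions from the Python standard library. The choice of algorithms, buffer size, and repeat count are assumptions. A serious study would use native implementations, control for CPU frequency scaling, and report variance across runs.

```python
import hashlib
import time


def throughput_mb_per_s(name: str, size: int = 1 << 20, repeats: int = 20) -> float:
    """Hash `repeats` buffers of `size` bytes and return megabytes per second."""
    data = b"\x00" * size
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return (size * repeats) / elapsed / 1e6


# Algorithms chosen for illustration; all are available through hashlib.
results = {name: throughput_mb_per_s(name) for name in ("sha1", "sha256", "sha512")}

for name, rate in results.items():
    print(f"{name:8s} {rate:10.1f} MB/s")
```

The absolute numbers depend heavily on the machine and on the library build, which is exactly the kind of discrepancy the project asks you to explain.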
- Forensic analysis of database systems
Assume you are a forensic investigator targeting the use of a database system. What can you learn about the recent activities of database clients from inspecting the system? You should install an open-source database system (PostgreSQL or MySQL), load a benchmark database (e.g., TPC-H), and execute some sample queries. Then inspect the system. What can you learn from the organization of files on disk, the system catalog, the existence and current state of indexes, and the transaction log? Distinguish between what you can learn with different access privileges.
- Forensics and History-independent data structures
A history-independent data structure (or the related oblivious data structure) is one whose state reveals nothing about the sequence of operations applied up until that point. For example, it should not be possible to infer from the state of a history-independent dictionary structure that one item was inserted before another.
Investigate the relationship between history-independent data structures and computer forensics. Choose one or more data structures used in real systems and consider what their state may reveal about the past. One possibility is to investigate what is revealed by the state of a standard B-tree. Which of the two formulations of the privacy property (Micciancio '97, Naor/Teague '01) is more relevant to forensics? What is the likely practical cost of trying to make these structures history-independent?
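The difference can be seen in a small experiment. The sketch below inserts the same keys in two different orders into an unbalanced binary search tree and into a sorted list. The tree's shape depends on the insertion order, while the sorted list ends up identical in both cases. Both structures are toy examples chosen for illustration, and the comparison covers logical state only. It says nothing about memory layout or on-disk representation, which a forensic examiner might also see.

```python
import bisect


def bst_insert(tree, key):
    """Insert into an unbalanced BST represented as (key, left, right) or None."""
    if tree is None:
        return (key, None, None)
    k, left, right = tree
    if key < k:
        return (k, bst_insert(left, key), right)
    if key > k:
        return (k, left, bst_insert(right, key))
    return tree  # key already present


def build_bst(keys):
    tree = None
    for key in keys:
        tree = bst_insert(tree, key)
    return tree


def build_sorted(keys):
    """Insert keys one at a time into a sorted list without duplicates."""
    items = []
    for key in keys:
        i = bisect.bisect_left(items, key)
        if i == len(items) or items[i] != key:
            items.insert(i, key)
    return items


order_a = [2, 1, 3]
order_b = [1, 2, 3]

# Same set of keys, different histories.
print(build_bst(order_a))     # balanced shape rooted at 2
print(build_bst(order_b))     # right-leaning chain rooted at 1
print(build_sorted(order_a) == build_sorted(order_b))
```

An examiner who sees the right-leaning chain can infer that 1 was inserted before 2 and 3. The sorted list supports no such inference.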
Intro to oblivious data structures
Oblivious data structures, Micciancio
Anti-persistence: history independent data structures, Naor/Teague