Cornell Systems Lunch
CS 7490 Spring 2018
The Systems Lunch is a seminar for discussing recent, interesting papers in the systems area, broadly defined to span operating systems, distributed systems, networking, architecture, databases, and programming languages. The goal is to foster technical discussions among the Cornell systems research community. We meet once a week on Fridays at noon in Gates 114.
The systems lunch is open to all Cornell Ph.D. students interested in systems. First-year graduate students are especially welcome. Non-Ph.D. students have to obtain permission from the instructor. Student participants are expected to sign up for CS 7490, Systems Research Seminar, for one credit.
Links to papers and abstracts below are unlikely to work outside the Cornell CS firewall. If you have trouble viewing them, this is the likely cause.
|January 26||Meltdown and Spectre
|February 2||Don’t Wrangle. Just Guess
Every few months there’s another article: "The system" labels a 12-year-old
as a terrorist or credit agencies ruin someone’s life due to a mixup. These
problems all occur because data management as we know it is designed for
perfect data. Huge amounts of time, effort, and money are spent on what has
come to be known as "Wrangling," or fine-tuning data cleaning pipelines to
preserve this illusion of perfection. In this talk, I propose a different
strategy: Let the database guess. I’ll introduce a system we’re building
called Mimir, which takes a principled approach to heuristic guesswork.
Mimir combines techniques from provenance, probabilistic query processing,
and compilers to provide a lightweight, easy-to-use, backwards-compatible
framework for helping users understand the impact of heuristic guesses on
query results and reports. In particular, I will focus on one of Mimir’s
query evaluation strategies, which implements a taint-based approximation of
probabilistic query processing that we call annotated-best-guess (ABG)
semantics. I’ll sketch some of our formal results about the tightness of
the ABG approximation, show how this approach outperforms classical
probabilistic query processing techniques, and finally outline some of
the other projects being developed as part of Mimir.
(This work is supported by NSF Award ACI-1640864, NPS Award #N00244-16-1-0022,
and gifts from Oracle)
Bio: Oliver Kennedy has been an assistant professor at the University at Buffalo, SUNY since 2012. He earned his PhD from Cornell University in 2011 and now leads the Online Data Interactions (ODIn) lab, which operates at the intersection of databases and programming languages. Oliver is the recipient of the UB SEAS Early Career Teacher of the Year Award and UB CSE’s Outstanding Funding Award, and has had several papers invited to "Best of" compilations from SIGMOD and VLDB. Oliver’s ODIn lab is currently exploring uncertain data management, just-in-time data structure design, and "small data" management.
|February 9||Attack and Defense of Trusted Execution Environment
Securing highly complex networked computing systems is a challenging task: despite significant efforts from government, industry, and academia, new vulnerabilities continue to be discovered every day. As computing systems become more integrated with society, there is an urgent need for trustworthy systems.
In this talk, I will describe my research efforts to build trustworthy computer systems. I will begin by briefly discussing my understanding of computer security from both the industry and academic perspectives, then summarize several challenges and opportunities in secure systems. I will then detail my work on realizing a trusted execution environment that can defend against advanced attackers launching multi-vector attacks on embedded devices. These efforts led not only to the design and implementation of a cache-assisted secure execution system on ARM processors, but also to studies of the fundamental limitations of these security mechanisms. By addressing the key challenges in embedded devices, my work paves the way for the further proliferation of trusted execution environments for mission-critical applications. To conclude the talk, I will briefly present my research vision for computer security in the future.
Bio: Dr. Ning Zhang is currently a technical lead at Cyber Security Innovations of Raytheon. He has also been an Adjunct Assistant Professor in the Department of Computer Science at Virginia Polytechnic Institute and State University since 2016. He has worked to protect military systems and critical infrastructure at Raytheon since 2007. Ning’s research focus is system security, which lies at the intersection of security, embedded systems, computer architecture, and software. Ning received his Ph.D. degree from the Complex Networks and Security Research Lab at Virginia Polytechnic Institute and State University in 2016, under the supervision of Dr. Wenjing Lou. Ning received his M.S. in System Engineering from Worcester Polytechnic Institute, and his M.S. in Computer Science and B.S. in Computer Science, Economics and Mathematics from the University of Massachusetts - Amherst.
|Ning Zhang (VT)|
|February 16||Privacy in an Era of Big Surveillance and Data
Businesses and governments are increasingly using sophisticated monitoring technologies to mine sensitive user information and perform mass surveillance. Anonymity systems such as the Tor network aim to protect user identity in online communications. Anonymity enables freedom of speech and communications on the Internet, which is an essential tool for our democratic society.
In this talk, I will consider the unique perspective of network level adversaries, such as Internet service providers, and their impact on the security of anonymity systems including Tor. First, I will present Raptor attacks (USENIX Security 2015), a suite of attacks that exploit structural properties of Internet routing and vulnerabilities in inter-domain routing (BGP) to compromise the privacy of Tor clients. Second, I will discuss Counter-Raptor defenses (IEEE S&P 2017), our proposed defenses for mitigating the threat of active routing attacks on Tor. Overall, the talk motivates rethinking our communication systems to mitigate network-level adversaries that can exploit and manipulate Internet routing.
Bio: Prateek Mittal is an Assistant Professor in the Department of Electrical Engineering at Princeton University, where he is also affiliated with the Center for Information Technology Policy. His research aims to design and develop privacy-preserving networked systems. A unifying theme in his work is to manipulate and exploit structural properties of networked systems to solve privacy challenges facing our society. His research has applied this distinct approach to widely-used operational systems, and has used the resulting insights to influence system design and operation, including that of the Tor network and the Let’s Encrypt certificate authority, directly impacting hundreds of millions of users. He is the recipient of faculty research awards from IBM (2017), Intel (2016, 2017), Google (2016, 2017), and Cisco (2016), the NSF CAREER award (2016), Princeton University’s E. Lawrence Keyes, Jr. award for outstanding research and teaching (2017), and Princeton innovation award (2015, 2017, 2018). He has received several outstanding paper awards, including at ACM CCS, and has thrice been named on the Princeton Engineering Commendation List for Outstanding Teaching (2014, 2015, 2016). He serves on the editorial board of the Privacy Enhancing Technologies Symposium (PETS), and has co-chaired the workshops on Free and Open Communications on the Internet (FOCI) and Hot Topics in Privacy Enhancing Technologies (HotPETS).
|Prateek Mittal (Princeton)|
|February 23||PebblesDB: Building Key-Value Stores using Fragmented Log-Structured Merge Trees
Pandian Raju, Rohan Kadekodi, Vijay Chidambaram, and Ittai Abraham (UT Austin, VMware Research)
|March 2||Cancelled due to snow storm
||The Incredible Snow Man|
|March 9||LITE Kernel RDMA Support for Datacenter Applications
Shin-Yeh Tsai and Yiying Zhang (Purdue)
|March 16||Practical Near-optimal Coflow scheduling
Saksham Agarwal, Shijin Rajakrishnan (Cornell), Akshay Narayan (MIT), Rachit Agarwal, David Shmoys (Cornell), Amin Vahdat (Google)
(unpublished manuscript---please do not distribute)
|March 23||Identifying, exploiting, and preventing unintended memorization in Deep Neural Networks
Ulfar Erlingsson (Google)
For the last several years, Google has been leading the development and real-world deployment of state-of-the-art, practical techniques for learning statistics and ML models with strong privacy guarantees for the data involved. (For example, this includes the RAPPOR and Prochlo mechanisms for learning statistics in the Chromium and Fuchsia OSS projects.) In this talk, he will start by presenting a new, easy-to-apply "exposure" metric which allows estimating the magnitude of problems that result when such protective techniques are *not* applied, and which can be utilized to extract individual secrets, such as social security numbers (see https://arxiv.org/abs/1802.08232).
|Ulfar Erlingsson (Google)|
|March 30||An Empirical Study on the Correctness of Formally Verified Distributed Systems
Pedro Fonseca, Kaiyuan Zhang, Xi Wang, and Arvind Krishnamurthy (UW)
|April 6||Spring Break, no meeting.|
|April 13||ACSU Luncheon, no systems lunch.|
|April 20||Horizontally Scalable Strongly Anonymous Communication
Srini Devadas (MIT)
To protect users’ privacy in the age of mass surveillance, there have been many works on secure communication that use end-to-end encryption to protect the content of the communication, like Signal and Let’s Encrypt. Unfortunately, encryption does little to hide the metadata of the communication (such as when and with whom a user is communicating) and provides virtually no privacy when one of the end points is compromised. In this talk, I will describe our efforts to design scalable anonymous communication networks to address this issue. I will first present Atom, a scalable anonymous communication network with strong anonymity properties. Atom allows its users to anonymously send messages, and protects the senders’ identities against powerful adversaries. At the same time, Atom is able to scale easily to more users by simply adding more servers to the network, unlike most existing systems with similar security properties. Then, I will describe some limitations of Atom, and present Quark, our ongoing effort. By using much more efficient crypto and a new routing protocol, Quark can provide similar guarantees to Atom with significantly less overhead.
Atom is joint work with Albert Kwon, Bryan Ford and Henry Corrigan-Gibbs. Quark is joint work with Albert Kwon.
Biography: Srini Devadas is the Webster Professor of EECS at MIT where he has been on the faculty since 1988. His current research interests are in computer security, computer architecture and applied cryptography. Devadas received the 2017 IEEE Wallace McDowell award for his research in secure hardware. He is the author of "Programming for the Puzzled" (MIT Press, 2017), a book that builds a bridge between the recreational world of algorithmic puzzles and the pragmatic world of computer programming, teaching readers to program while solving puzzles. Devadas is a MacVicar Faculty Fellow and an Everett Moore Baker teaching award recipient, considered MIT’s two highest undergraduate teaching honors.
|Srini Devadas (MIT)|
|April 27||State machine replication and the modern exchange
Electronic exchanges play an important role in the world’s financial system, acting as focal points where actors from across the world meet to trade with each other. But building an exchange is a difficult technical challenge, requiring high transaction rates, low, deterministic response times, and serious reliability. We’ll look at the question of how to design an exchange through the lens of Concord, a system for building exchange-like systems that was developed at Jane Street. Concord is designed from the ground up around state machine replication, a classic distributed systems technique. This choice has profound effects on the resulting system, providing a simple framework for building a reliable platform, while at the same time requiring very careful performance engineering to make it work effectively. We’ll discuss the pros and cons of the design, and consider the lessons it provides for other transaction processing systems.
|Yaron Minsky (Jane Street)|
|May 4||Automatic Clustering at Snowflake
Jiaqi Yan (Snowflake)
For partitioned tables, maintaining good clustering properties for frequently accessed dimensions is critical for partition pruning performance. Naive methods of clustering maintenance could be expensive, especially when the clustering dimensions are different from the dimensions with which the data is loaded. On the other hand, approximate clustering is cheaper to maintain while still resulting in good pruning performance. In this talk, I will present Snowflake’s clustering capabilities, including our algorithm for incremental maintenance of approximate clustering of partitioned tables, as well as our infrastructure to perform such maintenance automatically. I will also cover some real-world problems we run into and our solutions.
|Jiaqi Yan (Snowflake)|