NYU, Graduate Division, Computer Science Course, CSCI-GA 3033-017
Modern societies are increasingly dependent upon the proper functioning of computer systems. Hardware and software safety and security is critical in many applications of computer science (for example in cyber-physical systems such as avionics, medical devices, automotive/transportation, but also in enterprise and service software such as accounting, banking, trading, social networking, etc.).
Yet, computer programs are riddled with flaws that at best are inconvenient and annoying, and at worst, have extremely damaging consequences. The Heartbleed buffer over-read security bug in the open-source OpenSSL cryptography library is a recent example among thousands of other less advertised ones.
This situation is specific to computer science since in other engineering disciplines engineers are penally responsible for there errors while programmers are not, and so no computer scientist is compelled to use best practices. A direct consequence is the use of cheap and ineffective practices such as debugging. ``Program testing can be used very effectively to show the presence of bugs but never to show their absence'' (E. W. Dijkstra).
In software and hardware engineering, formal methods aim at using the rigor of mathematics for the specification, development, and verification of reliable and robust software and hardware systems. Formal methods are now part of the academic and industry standards as shown for example by the DO-178 C primary document for certifying all commercial software-based aerospace systems. In particular, the DO-333 supplement to the DO-178 C discusses the use of sound formal methods as part of a software life cycle. So a firm foundation in formal methods are now an integral part of the core knowledge of every high-level professional computer scientist.
The ultimate objective of formal methods is to verify the correctness of computer systems and to have this verification completely automated, using computers. This objective is hard to achieve because of undecidability: no computer can correctly answer with finite time and memory resources every question relative to the correctness of an arbitrary computer system. Therefore all automated formal methods have to make compromises to escape this inevitable limitation of the power (including any form of ``intelligence'') of computer systems.
All automated formal methods must provide answers in finite time on undecidable questions and so have limitations:
It is therefore necessary to study formal methods not only as a recipe to get correct programs but also from a mathematical point of view to understand their respective principles. This is the role of abstract interpretation. Abstract interpretation originated from the need to formalize the soundness and generalize dataflow analysis and typing performed by compilers into fully automated program verification (and is often limited to this understanding). It turned out that the principles recognized by abstract interpretation where able to establish the foundations of all formal methods (preferably sound, but also to study the origin of unsoundness in incorrect ones).
The general idea is that to state anything formal on a computer system, it is necessary to rigorously define its semantics that is a model of all its possible behaviors in any possible execution environment. Then reasoning on the computer system is done on an abstraction of the properties of this semantics, which has to be derived, often by calculational design, from the semantics. Most often an inductive reasoning is needed, and possible thanks to abstract induction. So the study of formal methods amounts to the study of the abstractions that they are built upon. Examples of applications of the theory of abstract interpretation involve SMT solvers 1, proof methods 2, and static analysers 3;
The course will put an emphasis on the use of abstract interpretation to design proof methods and static analyzers, which now become part of the practice of companies developing huge softare (like Facebook, Google, Microsoft, Spotify, Uber, etc, see for example CodeContracts or Infer).
The course is opened to PhD students and Master students eager to study at the PhD level, which presupposes basic knowledge and includes the ability to read and the effective reading of recent, up to date, and possibly difficult research articles.
Therefore students are assumed to have previously successfully studied courses in mathematics (set theory and first-order logic>), programming languages, compilation, and operating systems, to have a good practice of computer programming in any high-level programming languages.
Students without this minimal background will not really benefit from the course.
Background mathematical material will be provided for self-study to make sure the students master the basics.
© P. Cousot 2017