Description: This summer research proposal inaugurated and defined the field. It contains the first use of the term artificial intelligence and this succinct description of the philosophical foundation of the field: "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." (See philosophy of AI) The proposal invited researchers to the Dartmouth conference, which is widely considered the "birth of AI". (See history of AI.)
Fuzzy sets
Lotfi Zadeh
Information and Control, Vol. 8, pp. 338–353. (1965).
IRE Convention Record, Section on Information Theory, Part 2, pp. 56–62, 1957
(A longer version of this, a privately circulated report, 1956, is online).
Description: The first paper written on machine learning. Emphasized the importance of training sequences, and the use of parts of previous solutions to problems in constructing trial solutions to new problems.
Description: Decision trees are a common learning algorithm and a decision representation tool. Decision trees were developed by many researchers in many areas, even before this paper, though this paper is one of the most influential in the field.
Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm
Description: One of the papers that started the field of on-line learning. In this learning setting, a learner receives a sequence of examples, making predictions after each one, and receiving feedback after each prediction. Research in this area is remarkable because (1) the algorithms and proofs tend to be very simple and beautiful, and (2) the model makes no statistical assumptions about the data. In other words, the data need not be random (as in nearly all other learning models), but can be chosen arbitrarily by "nature" or even an adversary. Specifically, this paper introduced the winnow algorithm.
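The online protocol described above (predict, receive feedback, update) can be sketched in a few lines. This is a minimal illustration, assuming the multiplicative variant that divides weights on a false positive (often called Winnow2), a promotion factor of 2, and a threshold of n/2; the function names and the toy target (x[0] OR x[2]) are hypothetical and not taken from the paper.

```python
from itertools import product


def winnow_predict(weights, x, threshold):
    """Predict 1 when the total weight of the active (x_i == 1) features reaches the threshold."""
    score = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if score >= threshold else 0


def winnow_train(stream, n, alpha=2.0, threshold=None):
    """Process (x, y) pairs one at a time; update weights only after a mistake.

    Assumed update rule: multiply the weights of active features by alpha after a
    false negative, and divide them by alpha after a false positive.
    Returns the final weights and the number of mistakes made.
    """
    if threshold is None:
        threshold = n / 2  # assumed default, not prescribed by the text above
    weights = [1.0] * n
    mistakes = 0
    for x, y in stream:
        if winnow_predict(weights, x, threshold) == y:
            continue  # correct prediction: no update
        mistakes += 1
        factor = alpha if y == 1 else 1.0 / alpha
        weights = [w * factor if xi else w for w, xi in zip(weights, x)]
    return weights, mistakes


# Toy stream: 8 Boolean attributes, only two of which are relevant.
n = 8
data = [(x, int(x[0] or x[2])) for x in product((0, 1), repeat=n)]
weights, mistakes = winnow_train(data * 25, n)
print("mistakes:", mistakes, "weights:", weights)
```

No randomness is involved: the examples arrive in a fixed order, and the learner only changes its weights on the rounds where it errs.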
Learning to predict by the methods of temporal differences
Description: Proved that weak and strong learnability are equivalent in the noise-free PAC framework. The proof introduced the boosting method.
Description: This paper presented support vector machines, a practical and popular machine learning algorithm. Support vector machines use the kernel trick, a technique that is widely used in other learning methods as well.
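As a small illustration of the kernel trick mentioned above: a kernel function returns the inner product of two inputs in a larger feature space without constructing that space. The sketch below uses the homogeneous degree-2 polynomial kernel as an assumed example (it is not claimed to be the kernel used in the paper), and the function names are hypothetical.

```python
def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2, computed in the input space."""
    return dot(x, y) ** 2


def poly2_features(x):
    """Explicit feature map for the same kernel: all ordered products x_i * x_j."""
    return [a * b for a in x for b in x]


x, y = [1, 2, 3], [4, 5, 6]
# Both routes give the same value; the kernel never builds the 9-dimensional vectors.
print(poly2_kernel(x, y), dot(poly2_features(x), poly2_features(y)))
```

The explicit map grows quadratically with the input dimension, while the kernel costs one inner product, which is why algorithms that only need inner products can work in such spaces cheaply.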
Knowledge-based analysis of microarray gene expression data by using support vector machines
Description: The first application of supervised learning to gene expression data, in particular Support Vector Machines. The method is now standard, and the paper one of the most cited in the area.
Collaborative networks
Camarinha-Matos, L. M.; Afsarmanesh, H. (2005). Collaborative networks: A new scientific discipline, J. Intelligent Manufacturing, vol. 16, no. 4–5, pp. 439–452.
Camarinha-Matos, L. M.; Afsarmanesh, H. (2008). Collaborative Networks: Reference Modeling, Springer: New York.
Description: About grammar attribution, the base for yacc's s-attributed and zyacc's LR-attributed approach.
A program data flow analysis procedure
F.E. Allen, J. Cocke
Commun. ACM, 19, 137–147.
Description: From the abstract: "The global data relationships in a program can be exposed and codified by the static analysis methods described in this paper. A procedure is given which determines all the definitions which can possibly reach each node of the control flow graph of the program and all the definitions that are live on each edge of the graph."
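The "definitions which can possibly reach each node" part of the abstract can be illustrated with the round-robin iterative formulation found in compiler textbooks. This is a sketch under that assumption and is not necessarily the procedure the paper itself gives; the graph, the definition labels d1 to d3, and the function name are made up for the example.

```python
def reaching_definitions(cfg, gen, kill):
    """Compute the definitions that may reach the entry of each node.

    cfg:  dict mapping every node to the list of its successors
    gen:  dict mapping node -> set of definitions created in that node
    kill: dict mapping node -> set of definitions overwritten by that node
    """
    preds = {node: [] for node in cfg}
    for node, succs in cfg.items():
        for succ in succs:
            preds[succ].append(node)

    reach_in = {node: set() for node in cfg}
    reach_out = {node: set(gen[node]) for node in cfg}
    changed = True
    while changed:  # repeat until no set grows any further
        changed = False
        for node in cfg:
            new_in = set().union(*(reach_out[p] for p in preds[node]))
            new_out = gen[node] | (new_in - kill[node])
            if new_in != reach_in[node] or new_out != reach_out[node]:
                reach_in[node], reach_out[node] = new_in, new_out
                changed = True
    return reach_in


# B1: d1: x = 1      B2: d2: y = x (loop head)
# B3: d3: x = x + 1  B4: exit
cfg = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
gen = {"B1": {"d1"}, "B2": {"d2"}, "B3": {"d3"}, "B4": set()}
kill = {"B1": {"d3"}, "B2": set(), "B3": {"d1"}, "B4": set()}
result = reaching_definitions(cfg, gen, kill)
print(result)
```

Because of the loop edge from B3 back to B2, both definitions of x (d1 and d3) may reach the loop head, which is the kind of global relationship the analysis exposes.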
A Unified Approach to Global Program Optimization
Gary Kildall
Proceedings of ACM SIGACT-SIGPLAN 1973 Symposium on Principles of Programming Languages.
Description: Formalized the concept of data-flow analysis as fixpoint computation over lattices, and showed that most static analyses used for program optimization can be uniformly expressed within this framework.
Description: The Colossus machines were early computing devices used by British codebreakers to break German messages encrypted with the Lorenz Cipher during World War II. Colossus was an early binary electronic digital computer. The design of Colossus was later described in the referenced paper.
Description: It contains the first published description of the logical design of a computer using the stored-program concept, which has come to be known as the von Neumann architecture.
Description: The IBM System/360 (S/360) is a mainframe computer system family announced by IBM on April 7, 1964. It was the first family of computers making a clear distinction between architecture and implementation.
The case for the reduced instruction set computer
DA Patterson, DR Ditzel
Computer Architecture News, vol. 8, no. 6, October 1980, pp 25–33.
Description: Introduced the reduced instruction set computer (RISC), a CPU design philosophy that favors a reduced set of simpler instructions.
Comments on "the Case for the Reduced Instruction Set Computer"
Description: The Cray-1 was a supercomputer designed by a team including Seymour Cray for Cray Research. The first Cray-1 system was installed at Los Alamos National Laboratory in 1976, and it went on to become one of the best known and most successful supercomputers in history.
Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities
Gene Amdahl
AFIPS 1967 Spring Joint Computer Conference, Atlantic City, N.J.
Description: This paper discusses the concept of RAID disks, outlines the different levels of RAID, and the benefits of each level. It is a good paper for discussing issues of reliability and fault tolerance of computer systems, and the cost of providing such fault-tolerance.
The case for a single-chip multiprocessor
Kunle Olukotun, Basem Nayfeh, Lance Hammond, Ken Wilson, Kunyung Chang
Description: This paper argues that the approach taken to improving the performance of processors by adding multiple instruction issue and out-of-order execution cannot continue to provide speedups indefinitely. It lays out the case for making single chip processors that contain multiple "cores". With the mainstream introduction of multicore processors by Intel in 2005, and their subsequent domination of the market, this paper was shown to be prescient.
Description: A technique for image encoding using local operators of many scales.
Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
Stuart Geman and Donald Geman
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1984
Description: Introduced (1) Markov random fields (MRFs) for image analysis and (2) Gibbs sampling, which revolutionized computational Bayesian statistics and thus had a paramount impact on many fields beyond computer vision.
Snakes: Active contour models
Michael Kass, Andrew Witkin, and Demetri Terzopoulos
International Journal of Computer Vision, 1(4):321–331, 1988. (Marr Prize Special Issue)
Description: This paper introduced the entity-relationship diagram (ERD) method of database design.
SEQUEL: A structured English query language
Donald D. Chamberlin, Raymond F. Boyce
International Conference on Management of Data, Proceedings of the 1974 ACM SIGFIDET (now SIGMOD) workshop on Data description, access and control, Ann Arbor, Michigan, pp. 249–264
Description: This paper introduced the SQL language.
The notions of consistency and predicate locks in a database system
K.P. Eswaran, J. Gray, R.A. Lorie, I.L. Traiger
Communications of the ACM 19, 1976, 624–633
Description: This paper defined the concepts of transaction, consistency and schedule. It also argued that a transaction needs to lock a logical rather than a physical subset of the database.
Federated database systems for managing distributed, heterogeneous, and autonomous databases
Description: The classic paper on Multics, the most ambitious operating system in the early history of computing. Difficult reading, but it describes the implications of trying to build a system that takes information sharing to its logical extreme. Most operating systems since Multics have incorporated a subset of its facilities.
A note on the confinement problem
Butler W. Lampson
Communications of the ACM, 16(10):613–615, October 1973.
Description: This paper addresses issues in constraining the flow of information from untrusted programs. It discusses covert channels, but more importantly it addresses the difficulty in obtaining full confinement without making the program itself effectively unusable. The ideas are important when trying to understand containment of malicious code, as well as aspects of trusted computing.
Description: The Unix operating system and its principles were described in this paper. The main importance lies not in the paper but in the operating system, which had a tremendous effect on operating systems and computer technology.
Weighted voting for replicated data
David K. Gifford
Proceedings of the 7th ACM Symposium on Operating Systems Principles, pages 150–159, December 1979. Pacific Grove, California
Description: This paper describes the consistency mechanism known as quorum consensus. It is a good example of algorithms that provide a continuous set of options between two alternatives (in this case, between the read-one write-all, and the write-one read-all consistency methods). There have been many variations and improvements by researchers in the years that followed, and it is one of the consistency algorithms that should be understood by all. The options available by choosing quorums of different sizes provide a useful structure for discussing the core requirements for consistency in distributed systems.
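The range of options between the two extremes can be made concrete with a small sketch. It assumes each replica holds some number of votes and that a read needs r votes and a write needs w votes; it checks only that every read quorum overlaps every write quorum, and leaves out any further constraints a full scheme may impose (such as overlap between write quorums). The function names are hypothetical.

```python
from itertools import combinations


def read_write_overlap(votes, r, w):
    """Sufficient condition: if r + w exceeds the total votes, reads and writes must overlap."""
    return r + w > sum(votes)


def quorums(votes, needed):
    """Yield every set of replica indices whose votes total at least `needed`."""
    ids = range(len(votes))
    for size in range(1, len(votes) + 1):
        for subset in combinations(ids, size):
            if sum(votes[i] for i in subset) >= needed:
                yield set(subset)


def always_intersect(votes, r, w):
    """Brute-force check that every read quorum shares a replica with every write quorum."""
    return all(rq & wq for rq in quorums(votes, r) for wq in quorums(votes, w))


votes = [1, 1, 1]                         # three replicas, one vote each
print(always_intersect(votes, 1, 3))      # read-one, write-all
print(always_intersect(votes, 2, 2))      # a middle setting
print(always_intersect(votes, 3, 1))      # write-one, read-all
print(always_intersect(votes, 1, 2))      # r + w too small: a read can miss a write
```

Moving r down makes reads cheaper and writes more expensive, and vice versa; unequal vote counts, such as `[2, 1, 1]`, let some replicas count for more than others.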
Experiences with Processes and Monitors in Mesa
Butler W. Lampson, David D. Redell
Communications of the ACM, Vol. 23, No. 2, February 1980, pp. 105–117.
Description: The file system of UNIX. One of the first papers discussing how to manage disk storage for high-performance file systems. Most file-system research since this paper has been influenced by it, and most high-performance file systems of the last 20 years incorporate techniques from this paper.
The Design and Implementation of a Log-Structured File System
David L. Black, David B. Golub, Daniel P. Julin, Richard F. Rashid, Richard P. Draves, Randall W. Dean, Alessandro Forin, Joseph Barrera, Hideyuki Tokuda, Gerald Malan, David Bohman
Proceedings of the USENIX Workshop on Microkernels and Other Kernel Architectures, pages 11–30, April 1992.
Description: This is a good paper discussing one particular microkernel architecture and contrasting it with monolithic kernel design. Mach underlies Mac OS X, and its layered architecture had a significant impact on the design of the Windows NT kernel and modern microkernels like L4. In addition, its memory-mapped files feature was added to many monolithic kernels.
An Implementation of a Log-Structured File System for UNIX
Description: The paper described the first production-quality implementation of that idea, which spawned much additional discussion of the viability and shortcomings of log-structured file systems. While "The Design and Implementation of a Log-Structured File System" was certainly the first, this one was important in bringing the research idea to a usable system.
Soft Updates: A Solution to the Metadata Update problem in File Systems
Description: This paper describes the design and implementation of the first FORTRAN compiler by the IBM team. Fortran is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing.
Recursive functions of symbolic expressions and their computation by machine, part I[5]
Description: This paper introduced LISP, the first functional programming language, which was used heavily in many areas of computer science, especially in AI. LISP also has powerful features for manipulating LISP programs within the language.
B. Randell and L.J. Russell, ALGOL 60 Implementation: The Translation and Use of ALGOL 60 Programs on a Computer. Academic Press, 1964. The design of the Whetstone Compiler. One of the early published descriptions of implementing a compiler. See the related papers: Whetstone Algol Revisited, and The Whetstone KDF9 Algol Translator by B. Randell
Edsger W. Dijkstra, Algol 60 translation: an Algol 60 translator for the X1 and making a translator for Algol 60, report MR 35/61. Mathematisch Centrum, Amsterdam, 1961. [6]
Description: This series of papers and reports first defined the influential Scheme programming language and questioned the prevailing practices in programming language design, employing lambda calculus extensively to model programming language concepts and guide efficient implementation without sacrificing expressive power.
Description: Co-authored by the man who designed the C programming language, the first edition of this book served for many years as the language's de facto standard. As such, the book is regarded by many to be the authoritative reference on C.
Description: Written by the man who designed the C++ programming language, the first edition of this book served for many years as the language's de facto standard until the publication of the ISO/IEC 14882:1998: Programming Language C++ standard on 1 September 1998.
Wilkinson, J. H.; Reinsch, C. (1971). Linear algebra, volume II of Handbook for Automatic Computation. Springer. ISBN 978-0-387-05414-8.
Golub, Gene H.; van Loan, Charles F. (1996) [1983], Matrix Computations, 3rd edition, Johns Hopkins University Press, ISBN 978-0-8018-5414-9
Computational linguistics
Booth, T. L. (1969). "Probabilistic representation of formal languages". IEEE Conference Record of the 1969 Tenth Annual Symposium on Switching and Automata Theory. pp. 74–81.
The first published description of computational morphology using finite state transducers. (Kaplan and Kay had previously done work in this field and presented it at a conference; the linguist Johnson had remarked on the possibility in 1972, but had not produced any implementation.)
Rabiner, Lawrence R. (1989). "A tutorial on hidden Markov models and selected applications in speech recognition". Proceedings of the IEEE 77 (2): 257–286.
Brill, Eric (1995). "Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging". Computational Linguistics 21 (4): 543–566.
Describes a now commonly used POS tagger based on transformation-based learning.
Manning, Christopher D.; Schütze, Hinrich (1999), Foundations of Statistical Natural Language Processing, MIT Press
Textbook on statistical and probabilistic methods in NLP.
This survey documents the relatively under-researched importance of lazy functional programming languages (e.g. Haskell) for constructing natural language processors and for accommodating many linguistic theories.
Description: The importance of modularization and information hiding. Note that information hiding was first presented in a different paper by the same author: "Information Distribution Aspects of Design Methodology", Proceedings of IFIP Congress '71, 1971, Booklet TA-3, pp. 26–30
in Dahl, Dijkstra and Hoare, Structured Programming, Academic Press, London and New York, pp. 175–220, 1972.
Description: The beginning of object-oriented programming. This paper argued that programs should be decomposed into independent components with small and simple interfaces. It also argued that objects should have both data and related methods.
A technique for software module specification with examples
Description: A lovely story of how large software projects can go right, and then wrong, and then right again, told with humility and humor. Illustrates the "second-system effect" and the importance of simplicity.
Description: Statecharts are a visual modeling method. They extend state machines and can be exponentially more compact than an equivalent flat state machine. Statecharts therefore enable formal modeling of applications that were previously too complex to model. Statecharts are part of the UML diagrams.
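One way to read the "exponentially" claim, assuming it refers to concurrent (orthogonal) regions: a statechart lists the states of each region separately, while an equivalent flat state machine needs one state per combination. The sketch below only counts states; the region contents and the function name are made up for illustration.

```python
from itertools import product


def flatten(regions):
    """Return the flat-machine states for a set of concurrent regions.

    regions: list of lists of state names, one list per region.
    Each flat state is a tuple holding one state from every region.
    """
    return list(product(*regions))


# Ten independent on/off components, modeled as ten two-state regions.
regions = [["off", "on"] for _ in range(10)]
states_drawn = sum(len(region) for region in regions)   # states in the statechart
states_flat = len(flatten(regions))                     # states in the flat machine
print(states_drawn, states_flat)
```

Adding an eleventh component adds two states to the statechart but doubles the number of states in the flat machine.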
Theoretical computer science
Main article: List of important publications in theoretical computer science
Paris Kanellakis Award, a prize given to honor specific theoretical accomplishments that have had a significant and demonstrable effect on the practice of computing.