February 17th, 2007
When I graduated from college in 1965, I invested in two things to improve my chances of succeeding in my chosen career of computer software. First, I purchased an IBM Selectric typewriter; in fact, it was so expensive (a few hundred dollars) that I couldn’t just buy it in one fell swoop, and I had to negotiate a financing plan with IBM to pay it off at the rate of about $30 a month. There were no personal computers in those days, and an electric typewriter was about the most advanced form of personal technology I could get my hands on. I wrote my first three books on that typewriter, and carried it with me on business trips around the world until it finally fell apart in 1974. Things have changed a bit since then…
The second thing I did was join the Association for Computing Machinery (known to us computer geeks as the ACM), and I subscribed to a couple of their publications. The monthly Communications of the ACM (usually abbreviated as CACM) was the one I found most useful; the quarterly Journal of the ACM (JACM) was highly theoretical, and (for me) entirely inscrutable; and the quarterly ACM Computing Surveys (CS) was useful, but a little too elementary. During the next 33 years, I carefully filed each CACM, and took great pride in the fact that the accumulated issues began to fill several shelves of my office bookshelf. Of course, that didn’t mean that I actually read every article in every issue; but a few of the articles — such as “Exploratory experimental studies comparing online and offline programming performance,” by H. Sackman, W. J. Erikson, and E. E. Grant (CACM, January 1968) — remain classics even today, 40 years later.
In 1998, I moved to a new apartment. And even though the new apartment was only two blocks away, there was a great deal of house-cleaning: I threw out a lot of obsolete computer books — and after a great deal of hesitation and soul-searching, I threw out my entire 33-year collection of CACM, JACM, and CS issues, along with a few bookshelves filled with computer-industry trade journals. After all, the ACM had announced that they were all available on CD-ROM, and a rudimentary online service meant that even the CD-ROM probably wasn’t all that necessary. Equally important, I had to admit to myself that I never had read every article in every issue, and it was only on occasion that I searched through all of the hard-copy issues for a particular article that I needed for some work project.
With this in mind, it was a bit of a shock to see a headline, just a couple days ago, that said, “ACM Digital Library Now Exceeds One Million Entries; Online Computing and IT Publications Provide Rich Resources for Computing Professionals.” Holy cow, Batman: a million articles — all online! Not only do I throw out every hard-copy technical journal within days of receiving it these days, but I’ve also thrown out almost all of my technical computer books; to understand why, take a look at my August 12, 2006 blog posting, “Fahrenheit 451 Revisited: Do We Need To Keep Our Technical Books Any More?”
All of this is so obvious to members of the college/young-adult generation that it hardly warrants any discussion at all. But it’s important to ask ourselves: where do we go from here? One could argue that the primary improvement between 1965 and 1998 was that technology allowed us to search more efficiently through documents we already had available in hard-copy form. But the next improvement, which first appeared about a year later, was the Google-style search mechanism that allowed us to search through billions of documents that we never previously had access to in hard-copy form and, more important, documents whose existence we had been unaware of. To some extent, that latter description applies even to the new ACM digital library of one million documents.
So, with that as a foundation, what should the next generation be expecting to do? I think it will become easier and easier to search for answers (in the form of relevant articles, books, Ph.D. theses, etc.) for questions we have already formulated. The hard part will be deciding what questions to ask. Google can’t help us decide what problems need to be solved, and what areas of computer science (or any other field of human endeavor) need to be explored.
The other thing I hope our new generation of computer researchers will focus on is relationships between bits and pieces of existing knowledge. It’s great that the ACM can now provide us with a million different articles about computer science, but how are they related to one another? It’s easy enough to pull out subsets of articles, based on author or a Boolean combination of keywords and tags; but I think we need some kind of “six degrees of separation” analysis that will give us more insight into the fact that topic A (and all articles associated with A) has a connection to topic B, which is in turn related to topic C, and so forth. Serendipitous combinations of apparently-unrelated concepts may well be the source of the next generation of technological breakthroughs.
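As a rough illustration of that “six degrees of separation” idea, here is a minimal Python sketch. It assumes a hypothetical catalog in which every article carries a set of topic tags (the article IDs and tags below are invented for the example, not drawn from the ACM Digital Library). Two topics are linked whenever some article carries both, and a breadth-first search finds the shortest chain of topics from A to B.

```python
from collections import deque
from itertools import combinations

# Hypothetical catalog: article ID -> topic tags (invented for illustration).
ARTICLES = {
    "a1": {"time-sharing", "programmer productivity"},
    "a2": {"programmer productivity", "software estimation"},
    "a3": {"software estimation", "project management"},
    "a4": {"information retrieval", "indexing"},
}

def build_topic_graph(articles):
    """Link two topics whenever at least one article is tagged with both."""
    graph = {}
    for tags in articles.values():
        for tag in tags:
            graph.setdefault(tag, set())  # register topics with no neighbours too
        for a, b in combinations(sorted(tags), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def topic_chain(graph, start, goal):
    """Breadth-first search: shortest chain of topics from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # topic -> topic we reached it from
    queue = deque([start])
    while queue:
        topic = queue.popleft()
        if topic == goal:
            chain = []
            while topic is not None:  # walk back to the start
                chain.append(topic)
                topic = previous[topic]
            return chain[::-1]
        for neighbour in sorted(graph[topic]):
            if neighbour not in previous:
                previous[neighbour] = topic
                queue.append(neighbour)
    return None

graph = build_topic_graph(ARTICLES)
print(topic_chain(graph, "time-sharing", "project management"))
```

A real library would presumably need something richer than shared tags (citations, co-authorship, full-text similarity), but even this toy version surfaces a chain between two topics that no single article connects directly.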
Maybe this is all part of the “semantic Web” that underlies the forthcoming Web 3.0. Maybe it’s the basis of the next generation of Google. Maybe it’s all a wild dream on my part. But in the meantime, my bookshelves are empty, and it’s been well over ten years since I had an IBM Selectric typewriter in my office…

February 18th, 2007 at 5:14 pm
Ed -
You put your finger right in the middle of one of the great remaining challenges… over a period of years, as terminology mutates (COPYbook goes out of fashion and #include comes into fashion), how do I search for a term in 2005 across relevant articles written in 1965?
In 1965 COPYbook might have been considered a bleeding-edge concept, likely worthy of scholarly articles. Today, no one would want to admit they know what a copybook is. Worse, and this is the real point, the newbies will be left with the belief that C or Java invented #include in the 1980s or 90s.
The challenge becomes how to translate constantly evolving popular vocabulary across the static print media. The concepts are relatively constant, but the vocabulary, driven by commerce and fashion, appears to be constantly discovering “new” things.
The Google effect is indeed amazing but it only returns a very shallow slice of “knowledge.” Don’t know the right terms and you entirely miss what you THINK you’re looking for.
A major challenge, particularly since I believe Google has done bad things to the abstracting profession.
- David
February 19th, 2007 at 3:22 am
I read this article with great amusement. And unfortunately for David’s post, I happen to know what a copybook is and just left a position managing some folks that know all too well what that means.
It begs the question that is probably at the core - do you learn by doing web searches or do you learn by reading a wide range of material (books + current events…plus always reading contrary points of view)?
I think we all know the answer but it will remain to be seen if our educators - and parents (including me) - can keep the challenge and the right blend of high technology and old technology \