The History of Programming

Programming has always been guided by various methodologies. In the early days of computing, memories were quite small, and programs had to be loaded in by toggling switches on a panel. In those days it was possible for a programmer to keep track of every memory location and every machine instruction in his or her head. Since computer memories were so small (often just a few hundred bytes) and the machines so slow, program efficiency was the primary concern. Any program was acceptable as long as it worked. Algorithms were very closely tied to the capabilities of the specific machine they ran on. This is called machine language programming. The toggling of individual memory locations (by switch or other means) is called a first-generation language, and we’re being very liberal with the definition of language. In a first-generation language there is almost no abstraction.

As computers grew in power and memory, it was no longer possible for a programmer to keep track of what was happening at every location in the machine’s physical memory. Card readers and assembly language were invented to make programming more feasible. In assembly language the programmer uses mnemonic codes like MOV to represent particular bit sequences. These codes map directly to individual instructions on the CPU, and memory is still addressed directly: one code means exactly one CPU instruction. (More modern assembly languages don’t always map as directly to the CPU as the older ones did.) Algorithmically, the philosophy of “use whatever works” continued.

Assembly language was still a bear to deal with, especially where arrays and storage in memory were concerned. Therefore the first high-level programming language, Fortran, was invented to spare programmers the pain of keeping track of the locations of their variables in memory. (It’s interesting to note that this lesson has had to be learned again and again and again. The buggiest parts of C and C++ programs result from programmers being allowed to access arbitrary bytes of memory. Java has wisely removed this capability; 99 times out of 100 you don’t need it. A large part of training a C or C++ programmer to use Java consists of convincing them of this fact.) Fortran was the first example of a third-generation language. In a third-generation language you tell the computer the algorithms and data structures it should use to calculate the results you want, but you use more abstract logical and mathematical operators rather than directly manipulating addresses in memory and CPU instructions. In a third-generation language, statements represent several machine instructions; which instructions they represent may even depend on their context.

These languages may be compiled or interpreted. In either case your program code needs to be translated into equivalent machine instructions. This level of abstraction made considerably more powerful algorithms and data structures possible.

Java is a very advanced third-generation language. Most of the other computer languages you’re probably familiar with (Fortran, Basic, C, C++, Cobol, Pascal), as well as most of the ones you’re not familiar with (AppleScript, Frontier, Eiffel, Modula-3, Ada, PL/I, etc.), are also third-generation languages (or 3GL’s for short).

When third-generation languages were invented, they were supposed to make computers so easy to use that even the CEO could write programs. This turned out not to be true. Fourth-generation languages (or 4GL’s for short) moved the abstraction level a step higher. In these languages you tell the computer what results you want rather than telling it how to calculate those results. For instance, you would ask for the total sales for the year without specifying the loops necessary to sum the sales of all the salespeople. SQL is the most popular fourth-generation language.
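To make the contrast concrete, here is a minimal sketch in Java of the 3GL approach to the sales example above; the class name and the sample data are invented purely for illustration. The programmer spells out the loop and the running total, whereas in a 4GL such as SQL the same result is requested in a single declarative statement, shown in the closing comment.

public class SalesTotal {

    public static void main(String[] args) {
        // Hypothetical sample data, invented for illustration:
        // each sale has a year and an amount.
        int[]    years   = {1999, 1999, 2000};
        double[] amounts = {120.0, 80.5, 42.0};

        // 3GL style: we spell out *how* to compute the total,
        // step by step, with an explicit loop and an accumulator.
        double total = 0.0;
        for (int i = 0; i < years.length; i++) {
            if (years[i] == 1999) {
                total += amounts[i];
            }
        }
        System.out.println("Total sales for 1999: " + total);

        // 4GL style (SQL): we state *what* we want and let the system
        // figure out how to compute it, for example:
        //   SELECT SUM(amount) FROM sales WHERE year = 1999
    }
}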

Of all these languages, there’s no question that 3GL’s have been the most successful by almost any measure. A number of different styles of 3GL programming and programming languages have sprung up, most learning from the experience and mistakes of their predecessors. Fortran and its cousin Basic were the first. They shared with assembly language an attitude of “whatever works, no matter how ugly.” They had limited flow control (essentially for loops and goto statements) and one data structure, the array. All variables were global, and it was impossible to hide one part of the program from any other. Although it was possible to write maintainable, legible code in these languages, few people did.

Pascal and C were the next widely successful languages. They made possible a style of programming known as structured programming. Structured programming languages have many different flow control constructs (switch statements, while loops, and more) as well as tools for building more complicated data structures (structs, records, and pointers). Goto is deprecated in structured programming, though not eliminated entirely. (It is still necessary for some error handling.) Finally, these languages have subroutines with local variables, which make it possible to split the code into more manageable and understandable chunks. Structured languages proved better suited to writing larger, more maintainable programs. However, they too began to bog down when faced with the need to share code among programmers and to write very large (greater than 50,000-line) programs.
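To give a feel for these constructs, here is a small, hypothetical sketch in Java (whose flow control syntax is inherited from C): a while loop, a switch statement, and subroutines whose local variables are hidden from the rest of the program. The method names and values are invented only for illustration.

public class StructuredExample {

    // A subroutine with its own local variables: the accumulator and
    // the loop counter are invisible to the rest of the program.
    static int sumUpTo(int n) {
        int total = 0;
        int i = 1;
        while (i <= n) {   // structured flow control instead of goto
            total += i;
            i++;
        }
        return total;
    }

    // A switch statement chooses among several branches without goto.
    static String describe(int dayOfWeek) {
        switch (dayOfWeek) {
            case 6:
            case 7:
                return "weekend";
            default:
                return "weekday";
        }
    }

    public static void main(String[] args) {
        System.out.println(sumUpTo(10));   // prints 55
        System.out.println(describe(6));   // prints weekend
    }
}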

Some of the above history may sound a little funny to those of you with experience in the languages I’m discussing. After all, Basic has subroutines and local variables, doesn’t it? The fact is that successful computer languages have continued to evolve. Fortran now has pointers, so it can create more complicated data structures. Basic has while loops. Cobol has objects. And on some architectures, like Alpha/VMS, the assembly language bears little to no resemblance to the underlying machine architecture. These features were not part of the first versions of these languages, however. And despite these improvements, the modern versions of these languages are their parents’ children. Basic and Fortran programmers still often produce spaghetti code. Assembly language is quick to run but slow to write. C is obfuscated beyond the comprehension of mere mortals.

The third generation of 3GL’s (3.3 GL’s) began to take hold in the late 80’s. These were the object oriented languages. Although object oriented languages had been around since the late 1960’s, it wasn’t until the late 80’s that computer hardware became fast enough and memory cheap enough to support them. (Object oriented programming is not a panacea. It exacts a speed penalty over plain vanilla C or Fortran code, and often requires twice as much memory.)

Object oriented languages included all the features of structured programming and added still more powerful ways to organize algorithms and data structures. There are three key features of object oriented programming (OOP for short): encapsulation, polymorphism, and inheritance. All of them are tied to the notion of a class.
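As a concrete (and hypothetical) illustration, here is a minimal Java sketch; the Shape and Circle classes are invented for this example only. It shows encapsulation (fields hidden inside a class and reached only through its methods), inheritance (a subclass reusing its parent’s code), and polymorphism (an overridden method selected at run time), all built on the notion of a class.

// A class bundles data and the behavior that operates on it.
class Shape {

    private double x, y;            // encapsulation: fields hidden behind methods

    Shape(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double area() {                 // subclasses override this
        return 0.0;
    }
}

class Circle extends Shape {        // inheritance: Circle reuses Shape

    private double radius;

    Circle(double x, double y, double radius) {
        super(x, y);
        this.radius = radius;
    }

    @Override
    double area() {                 // polymorphism: same call, different behavior
        return Math.PI * radius * radius;
    }
}

public class ShapeDemo {
    public static void main(String[] args) {
        Shape s = new Circle(0.0, 0.0, 2.0);   // a Circle used through its Shape type
        System.out.println(s.area());          // calls Circle's area(), not Shape's
    }
}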
