So Just How Much Faster Are Computers Today?
I purchased my first computer, an Apple ][+, in 1983. I paid US$2,000 for it. At the time, it was a high-end PC (the term personal computer was just then coming into use) with two 5.25” floppy drives, a 6502 microprocessor, and 48 Kbytes of RAM. There was no hard drive. I wrote everything in assembly language, including a complete set of floating point math routines. I developed my own format for single precision floating point numbers. After I had completed the project, I discovered that my custom format had turned out to be identical to the standard IEEE floating point format! So now I know why a floating point number has its exponent represented in something called “excess binary” rather than regular binary or two’s complement... I had figured it out the hard way. (Among other reasons, this allows a true zero to have a zero exponent, which is less than the exponent of any non-zero number.)
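In case you have never looked at the bits directly, here is a minimal sketch in C (not the original 6502 assembly, and purely illustrative) that reads out the excess-127 exponent field of an IEEE single precision number:

    /* Minimal sketch: extract the biased ("excess-127") exponent field of an
       IEEE single-precision float.  The stored exponent is the true exponent
       plus 127, so true zero (all bits zero) has the smallest possible field. */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    static uint32_t exponent_field(float x)
    {
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);  /* reinterpret the float's 32 bits */
        return (bits >> 23) & 0xFF;      /* bits 30..23 hold the exponent   */
    }

    int main(void)
    {
        printf("%u\n", exponent_field(0.0f));  /* 0   : true zero            */
        printf("%u\n", exponent_field(0.5f));  /* 126 : exponent -1 plus 127 */
        printf("%u\n", exponent_field(1.0f));  /* 127 : exponent  0 plus 127 */
        printf("%u\n", exponent_field(2.0f));  /* 128 : exponent  1 plus 127 */
        return 0;
    }

One nice consequence of the biased form: the bit patterns of non-negative floating point numbers sort in the same order as unsigned integers, so simple integer compares give the right ordering.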
My second computer was a Commodore 64, again about US$2,000. I did not get a floppy drive because it was bulky and expensive. The computer had an extraordinary full 64 Kbytes of RAM. After accounting for some changes in the memory map, like different I/O locations, I would assemble the software on my Apple and then just copy the binary over to the Commodore using an RS-232 serial link, for which I wrote the drivers on both ends myself. To ship software to customers, I recorded it onto an audio cassette tape.
I have fond memories of those days, and many late nights, slaving away on the computer as I wrote an antenna analysis program. However, I never did a matrix solve on those first two computers. That would wait for my third computer, which I acquired in 1985. One of the first IBM-PCs, it also cost about US$2,000, and it also had two floppy drives and no hard drive. There was a whopping 640 Kbytes of RAM, and it ran at the blazing speed of 4.77 MHz. (That’s right, MHz, not GHz!) In 1985, this was an incredible speed. After all, that was at a higher frequency than my first amateur radio contact, 3.726 MHz.
I also splurged for a US$50 copy of Borland Turbo Pascal, so I could turn out software a little faster. It was nice not to have to buy Microsoft’s US$400 equivalent, because I was a graduate student with a mortgage, a wife, and a new baby. Expenses were very carefully considered back then.
I was working on my dissertation on the Method of Moments under Prof. Roger Harrington, who first fully described and unified the technique. As part of the Method of Moments, we must invert a matrix. The algorithm usually used is called LU (for “Lower-Upper”) decomposition. The first IBM-PCs used the Intel 8088 microprocessor. It had a 16-bit internal data bus but was 8 bits externally, thus reducing cost. To speed up math operations, I had also purchased the optional 8087 co-processor. It was a separate chip (today the co-processor and CPU are on the same chip). While a mathematical operation is being performed in the 8087, the address of the next operand can be calculated in the 8088. So I wrote the inner loop of the LU decomposition routine in assembly language for maximum speed.
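For readers who have never written one, here is a minimal sketch of an in-place LU decomposition in C, simplified to a real-valued matrix with no pivoting rather than the 8088/8087 assembly I actually used; the innermost multiply-accumulate loop is where nearly all the time goes, and it is the part worth hand coding:

    /* Minimal sketch of in-place LU decomposition (Doolittle form, no pivoting,
       real-valued), purely illustrative.  After it runs, the unit-lower factor L
       and the upper factor U overwrite the matrix a. */
    void lu_decompose(double *a, int n)            /* a: n x n matrix, row-major */
    {
        for (int k = 0; k < n; k++) {              /* eliminate column k         */
            for (int i = k + 1; i < n; i++) {
                a[i*n + k] /= a[k*n + k];          /* multiplier, stored in L    */
                double lik = a[i*n + k];
                for (int j = k + 1; j < n; j++)    /* the inner loop             */
                    a[i*n + j] -= lik * a[k*n + j];
            }
        }
    }

Once the factors are in hand, each excitation is handled by a forward and a back substitution, and the three nested loops above are exactly where the N³ cost discussed below comes from.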
Flash back to the late 1960s and early 1970s. Roger Harrington is placing the Method of Moments into its modern, unified form. However, some of his attempts to publish are met with skepticism by reviewers. One reviewer comment, which Prof. Harrington told me about, went something like this: “All your work on numerical electromagnetics is useless because it has been proven that it is impossible to invert even a 100 × 100 matrix... because the magnetic tape would wear out going back and forth.” Very funny today, but back then it was probably a true statement. It is just that the reviewer did not anticipate that things might change in the future. We should all be careful not to make similar mistakes when we review papers today.
My IBM-PC had sufficient memory to store a 100 × 100 matrix. It took about one hour to invert. Prof. Harrington was pleased.
When I first published my work with the Method of Moments, I was likewise told by a respected microwave engineer that all this numerical electromagnetics was ivory tower academic stuff, completely useless to the practicing microwave engineer. At the time, he was right. A 100 × 100 matrix was about the largest that we could handle. One hundred subsections is not big enough to do any more than a couple of simple discontinuities. At that point, I also had to decide whether to continue on to commercialization or to drop the idea and get on with life.
It was discouraging. Matrix solve is an N³ process. That means we would need a computer eight times faster to do 200 subsections in one hour, as we would need to invert a 200 × 200 matrix. Would we ever see a computer with, my gosh... a 40 MHz clock rate? Maybe in five years? Maybe ten? How about maybe never.
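(That estimate is just the N³ scaling at work: (200/100)³ = 8, and eight times the 4.77 MHz clock is roughly 38 MHz, call it 40 MHz, assuming solve speed simply tracks clock rate.)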
In spite of the gloomy prospects, I decided to go on with commercialization. Gradually, computers got faster. We started making sales. We could move into office space, hire employees. The past quarter century has turned out to be an incredible ride.
So how much faster are computers today? Late in 2010, my son built a liquid-cooled dual hexa-core computer with 24 GBytes of RAM. It has two hard drives... and no floppy drives! The cost is about US$3,000. I can now invert a 100,000 × 100,000 matrix in just under three hours.
Given that matrix solve is an N³ operation, this means that hardware and software today, as applied to the same matrix solve technique, are now over 300,000,000 times faster than in 1985: the new solve is (100,000/100)³, or one billion, times as much work, finished in three hours instead of one. Who would have ever guessed? Not in my wildest dreams. Today is a wonderful time to be involved in the field of applied numerical electromagnetics.