A computer program is a collection of instructions, or statements (also
called code), carried out by the computer's CPU. These instructions
can be written in many different languages. Except for operating systems and
utilities, all the categories of programs listed in the preceding section
are application programs. These programs are developed to help users solve
problems or perform tasks, such as sending mail messages, editing text, or
finding a specific article in a library.
Application programs are usually composed of many files.
Some of these files contain instructions for the computer, whereas other
files contain data. On DOS- or Windows-based PCs, some common extensions for
program files are .exe (executable), .dll (dynamic link library), .ini
(initialization), and .hlp (help). These extensions define the type of file.
By default, most program files are stored in the folder
that bears the application's name or an abbreviation of it. To view a list
of the files needed to run an application, you can open that application's
directory or folder (see Figure 11.12 for an example).
The system software that controls your computer also includes many files.
Program Control Flow
Although all the files in the program's folder are considered parts of
the program, usually one file represents the core. This file is often called
the executable file because it is the one that the computer executes when
you launch the program. On PCs running DOS or Windows, the executable file
has the same name as the program (or an abbreviation of it), plus the
extension .exe.
When you launch a program, the computer begins reading and carrying out
statements at the executable's main entry point. Typically, this entry point
is the first line (or statement) in the file, although it may be elsewhere.
After execution of the first statement, program control passes, or flows, to
another statement, and so on, until the last statement of the program has
been executed. Then the program ends. The order in which program statements
are executed is called program control flow.
The example in Figure 11.14 shows the flow of a small program that
controls a furnace. The program constantly checks the thermostat setting and
current temperature. If the current temperature is at or above the
thermostat setting, the program executes the statements that turn off the
furnace. If the current temperature is below the thermostat setting, the
program executes the statements that turn on the furnace.
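The furnace logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name furnace_command and the temperature readings are hypothetical and are not taken from the program in Figure 11.14.

```python
def furnace_command(current_temp, thermostat_setting):
    """Return the command the control program would issue on one check."""
    if current_temp >= thermostat_setting:
        # At or above the setting: control flows to the "turn off" statements.
        return "off"
    # Below the setting: control flows to the "turn on" statements.
    return "on"


# One pass through a series of hypothetical temperature readings.
for reading in (66, 68, 70):
    print(reading, furnace_command(reading, thermostat_setting=68))
```

A real controller would repeat this check continuously; the short loop here stands in for that constant checking.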
Variables
One element that most computer languages have in common is variables--placeholders
for data being processed. For example, imagine you are writing a program
that prompts users to enter their ages. You need a placeholder, or variable,
to represent the data (different ages) they enter. In this case, you might
choose to name the variable Age. Then, when a user enters a number at the
prompt, this data becomes the value of the variable Age.
Just as in algebra, you can use variables in programs to perform actions
on data. For example, consider the instruction:
Age + 2
If the value of Age is 20, the result of this instruction is 22; if the
value is 30, the result is 32; and so on.
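As a small illustration, the same idea might look like this in Python. The lowercase name age and the assigned values are assumptions chosen to match the example in the text.

```python
age = 20          # the value a user might enter at the prompt
result = age + 2  # the expression from the text, using the variable
print(result)     # prints 22

age = 30          # the same variable can hold a different value
print(age + 2)    # prints 32
```

In an interactive program, the value would typically come from the user, for example through int(input("Enter your age: ")), rather than being assigned directly.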
Algorithms and Functions
Algorithms are the series of steps by which problems are solved.
Algorithms have been worked out for solving a wide range of mathematical
problems, such as adding numbers or finding a square root.
Algorithms represent solutions to problems. The steps to the
solution (instructions) remain the same, whether you are working out the
solution by computer or by hand.
Functions are the expression of algorithms in a specific computer
language. You reuse functions when you need them. You do not have to rewrite
the lines of code represented by the function each time. For example:
sqrt(x)
is a way of referring to the square root function. The (x) after the
name of the function is the argument. You use arguments to pass input to
functions as the program runs. In this example, x is a variable representing
a number. If x is equal to 12, then the function will find the square root
of 12. After the function finds the square root, it returns this value to
the program. You can then use the value in other calculations.
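To make the relationship between an algorithm and a function concrete, here is a minimal Python sketch. It assumes one particular square root algorithm (repeatedly averaging a guess with x divided by the guess, often called the Babylonian method); the text does not say which algorithm a given sqrt function uses, and the name square_root is hypothetical.

```python
import math


def square_root(x, tolerance=1e-12):
    """Approximate the square root of a non-negative number x."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    # Repeatedly average the guess with x / guess until guess * guess
    # is within the relative tolerance of x.
    while abs(guess * guess - x) > tolerance * x:
        guess = (guess + x / guess) / 2
    return guess


x = 12
root = square_root(x)      # the argument x is passed to the function
print(root * 2)            # the returned value is used in another calculation
print(math.sqrt(x))        # Python's library function, for comparison
```

The steps inside the function are the algorithm; the function wraps them so the program can reuse them by name with any argument.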
Different computer languages use different names, such as subroutines,
procedures, and routines, for these blocks of reusable code. There are
subtle differences between these terms, but for now, the term "function"
will be used for all of them.
TWO APPROACHES TO PROGRAMMING
Until the 1960s, relatively little structure was imposed
on how programmers wrote code. As a result, following control flow through
hundreds or thousands of lines of code was almost impossible. Programmers,
for example, often used goto statements to jump to other parts of a program.
A goto statement does exactly what the name implies. It identifies a
different line of the program to which control jumps. (It goes to.) The
issue with goto statements is identifying how program control flow proceeds
after the jump. Does control return to the jumping-off place, or does it
continue at the new location?
Structured programming evolved in the 1960s and 1970s. The name refers to
the practice of building programs using a set of well-defined structures.
One goal of structured programming is the elimination of goto statements.
Software developers have found that using structured programming results
in improved efficiency, but they continue to struggle with the process of
building software quickly and correctly.
Reuse is recognized as the key to the solution. Reusing code allows
programs to be built quickly and correctly. Functions, which are the
building blocks of structured programming, are one step along this path. In
the 1980s, computing took another leap forward with the development of
object-oriented programming (OOP). The building blocks of OOP, called
objects, are reusable, modular components. Experts claim that OOP will be
the dominant programming approach through at least the end of the 1990s.
OOP builds on and enhances structured programming. You do not leave
structured programming behind when you work with an object-oriented
language. Objects, for example, are composed of structured program pieces,
and the logic of manipulating objects is also structured.
Structured Programming
Researchers in the 1960s demonstrated that programs could be written with
three control structures:
Sequence structure defines the default control flow in a
program. Typically, this structure is built into programming languages. As a
result, unless directed otherwise, a computer executes lines of code in the
order in which they are written. Figure 11.15 shows a flowchart of this
sequential flow. The commands in the rectangles represent two sequential
lines of code. Program control flows from the previous line of code to the
next line. The commands are written in pseudo-code, which is an informal
language programmers use as they are working through the logic of a program.
After the command sequence is developed, the programmers translate the
pseudo-code into a specific computer language.
Selection structures are built around a condition
statement. If the condition statement is true, certain lines of code are
executed. If the condition statement is false, those lines of code are not
executed. The two most common selection structures are: If-Then and If-Else
(sometimes called If-Then-Else). Figures 11.16 and 11.17 illustrate these
types of structures.
Repetition (or looping) structures are also built around
condition statements. If the condition is true, then a block of one or more
commands is repeated until the condition is false. The computer first tests
the condition and, if it is true, executes the command block once. It then
tests the condition again. If it is still true, the command block is
repeated. Because of this cycling, repetition structures are also called
loops. Three common looping structures are: For-Next, While, and Do-While.
Figures 11.18-11.20 illustrate these three looping structures.
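The three control structures can be sketched together in Python. This is an illustrative example only: the function count_down and its values are hypothetical, and because Python has no Do-While statement, the last loop uses a common idiom that approximates one.

```python
def count_down(start):
    """Collect the numbers from start down to 1, then a final message."""
    steps = []                 # sequence: statements run in written order
    current = start

    while current > 0:         # repetition (While): test first, then repeat
        steps.append(current)
        current -= 1

    if not steps:              # selection (If-Else): choose one of two blocks
        steps.append("nothing to count")
    else:
        steps.append("done")
    return steps


print(count_down(3))           # [3, 2, 1, 'done']

# For-Next style loop: repeat a fixed number of times.
total = 0
for number in range(1, 6):
    total += number
print(total)                   # 15

# Do-While style loop: run the block once, then test the condition
# at the bottom of the block.
attempts = 0
while True:
    attempts += 1
    if attempts >= 3:
        break
print(attempts)                # 3
```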
Object-Oriented Programming
Concepts of object-oriented programming, such as objects and classes, can
seem abstract at first, but many programmers claim that an object
orientation is a natural way of thinking about the world. Because OOP gives
them an intuitive way to model the world, they say, programs become simpler,
programming becomes faster, and the burden of program maintenance is
lessened.
Objects
Look around you--you are surrounded by objects. Right
now, the list of objects around you might include a book, a computer, a
light, walls, plants, pictures, and so forth.
Think for a moment about what you perceive when you look at a car on the
street. Your first impression is probably of the car as a whole. You do not
focus on the steel, chrome, and plastic elements that make up the car. The
entire unit, or object, is what registers in your mind.
Now, how would you describe that car to someone sitting
next to you? You might start with its color, size, and shape. A car, like
all objects, has attributes. You might then talk about what the car can do.
For example, it can accelerate from 0 to 60 mph in 9.2 seconds, it turns on
a dime, and so forth. Again, like all objects, a car has certain things it
can do, or functions. Together, the attributes and the functions define the
object. In the language of OOP, every object has attributes and functions
and can encapsulate other objects.
When you look more closely at the car, you may begin to notice many
smaller component objects. The car, for example, has a chassis, a drive
train, a body, and an interior. Each of these components is, in turn, made
up of other objects. The drive train includes an engine, transmission, rear
end, and axle. An object, then, can be
either a whole unit or a component of other objects. Objects can include
other objects.
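A brief Python sketch may help show how attributes, functions, and component objects fit together. The class names, attribute names, and values below are assumptions made up for illustration.

```python
class Engine:
    """A component object with its own attributes and function."""

    def __init__(self, horsepower):
        self.horsepower = horsepower   # attribute
        self.running = False           # attribute

    def start(self):                   # function
        self.running = True


class Car:
    """A whole object that includes another object (its engine)."""

    def __init__(self, color, engine):
        self.color = color             # attribute
        self.engine = engine           # component object
        self.speed_mph = 0

    def accelerate(self, amount_mph):  # function
        if not self.engine.running:
            self.engine.start()
        self.speed_mph += amount_mph


car = Car(color="red", engine=Engine(horsepower=150))
car.accelerate(30)
print(car.color, car.speed_mph, car.engine.running)  # red 30 True
```

The code that uses the car works with the whole unit; the engine is handled inside the Car object.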
Classes and Class Inheritance
As you contemplate the objects around you, you will find that you
naturally place them in abstract categories, or classes, with other similar
objects. For example, the Porsche, Infiniti, and Saturn you see on the road
are all cars. In OOP, therefore, you would group these into a car class.
A class consists of attributes and functions shared by more than
one object. All cars, for example, have a steering wheel and four tires. All
cars can drive forward, reverse, park, and accelerate. Class attributes are
called data members, and class functions are represented as member functions
or methods.
Classes can be divided into subclasses. The car class, for
example, could have a luxury sedan class, a sports car class, and a pickup
truck class. Subclasses typically have all the attributes and methods of the
parent class. Every sports car, for example, has a steering wheel and can
drive forward. This is called class inheritance.
However, in addition to inherited characteristics, subclasses have unique
characteristics of their own. For example, pickup trucks have four-wheel
drive and trailer hitches, as shown in Figure 11.22.
All objects belong to classes. When an object is created, it
automatically has all the attributes and methods associated with that class.
In the language of OOP, objects are instantiated (created).
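Classes, inheritance, and instantiation might be expressed in Python as follows. This is a sketch under stated assumptions: the class names, the data members, and the make "Hypothetical Motors" are invented for the example.

```python
class Car:
    """Parent class: attributes and methods shared by all cars."""

    def __init__(self, make):
        self.make = make          # data member
        self.tires = 4            # data member shared by every car
        self.gear = "park"

    def drive_forward(self):      # member function (method)
        self.gear = "forward"


class PickupTruck(Car):
    """Subclass: inherits from Car and adds characteristics of its own."""

    def __init__(self, make, has_trailer_hitch=True):
        super().__init__(make)
        self.has_trailer_hitch = has_trailer_hitch

    def tow(self):
        return "towing" if self.has_trailer_hitch else "no hitch"


truck = PickupTruck("Hypothetical Motors")   # instantiating an object
truck.drive_forward()                         # inherited method
print(truck.tires, truck.gear, truck.tow())   # 4 forward towing
```

The truck object never defines tires or drive_forward itself; it receives both from the parent class when it is created.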
Messages
Objects do not typically perform behaviors spontaneously. After all, many
of these behaviors may be contradictory. A car, for example, cannot go
forward and in reverse at the same time. You also expect that the car will
not drive forward spontaneously either!
You send a signal to the car to move forward by pressing on the
accelerator. Likewise, in OOP, messages are sent to objects, requesting them
to perform a specific function. Part of designing a program is to identify
the flow of sending and receiving messages among the objects (see Figure
11.23).
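The following Python sketch illustrates one object sending a message to another. The explicit receive method is an assumption made to keep the idea visible; in Python, as in many object-oriented languages, sending a message usually just means calling one of the object's methods.

```python
class Car:
    def __init__(self):
        self.direction = "stopped"

    def receive(self, message):
        """Respond to a message by performing the matching function."""
        if message == "forward":
            self.direction = "forward"
        elif message == "reverse":
            self.direction = "reverse"
        elif message == "stop":
            self.direction = "stopped"
        else:
            raise ValueError(f"unknown message: {message}")
        return self.direction


class Driver:
    """An object that sends messages to another object."""

    def __init__(self, car):
        self.car = car

    def press_accelerator(self):
        return self.car.receive("forward")


car = Car()
driver = Driver(car)
print(car.direction)        # stopped
driver.press_accelerator()
print(car.direction)        # forward
```

Because the car holds a single direction at a time, it cannot go forward and in reverse at once, and it changes state only when a message arrives.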
THE EVOLUTION OF PROGRAMMING LANGUAGES
Programming is a way of sending instructions to the computer. To be sure
that the computer (and other programmers) can understand these instructions,
programmers use defined languages to communicate. These languages have many
of the same types of rules as languages people use to communicate with each
other. For example, information must be provided in a certain order and
structure, symbols are used, and punctuation is often required.
The only language that a computer understands is its machine language.
People, however, have difficulty understanding machine code. As a result,
researchers first developed assembly languages and then higher-level
languages. This evolution represents a transition from strings of numbers
(machine code) to command sequences that you can read like any other
language. Higher-level languages focus on what the programmer wants the
computer to do, not on how the computer will execute those commands.
Hundreds of programming languages are now in use. These languages fall
into the following categories:
· Machine languages are the most basic of languages. Machine
languages consist of strings of numbers and are defined by hardware design.
In other words, the machine language for a Macintosh is not the same as the
machine language for a PC. A computer understands only its native machine
language--the commands of its instruction set. These commands instruct the
computer to perform elementary operations such as loading, storing, adding,
and subtracting. Ultimately, machine code consists entirely of the 0s and 1s
of the binary number system.
· Assembly languages were developed by using English-like
mnemonics for commonly used strings of machine language. Programmers worked
in text editors, which are simple word processors, to create source files.
Source files contain instructions for the computer to execute, but the files
must first be translated into machine language. Researchers created
translator programs called assemblers to perform the conversion. Assembly
languages are still highly detailed and cryptic, but reading assembler code
is much faster than struggling with machine language. Programmers seldom
write programs of any significant size in an assembly language. (One
exception to this rule is found in action games where the speed of the
program is critical.) Instead, they use assembly languages to fine-tune
important parts of programs written in a higher-level language.
· Higher-level languages were developed to make programming
easier. These languages are called higher-level languages because their
syntax is closer to human language than assembly or machine language code.
They use familiar words instead of communicating in the detailed quagmire of
digits that make up the machine instructions. To express computer
operations, these languages use operators, such as the plus or minus sign,
that are the familiar components of mathematics. As a result, reading,
writing, and understanding computer programs is easier with a higher-level
language--although the instructions must still be translated into machine
language before the computer can understand and carry them out.
Commands written in any assembly or higher-level language must be
translated into machine code before the computer can execute them. For
higher-level languages, the translator programs are called compilers.
Typically, then, a program must be compiled, or translated into machine
code, before it is run. Compiled program files become executables. The next
section outlines a few of the more important higher-level programming
languages.
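To give a feel for what a translator program does, here is a toy assembler written in Python. Everything in it is hypothetical: the four mnemonics, their numeric opcodes, and the two-number instruction format do not correspond to any real CPU's instruction set.

```python
# Hypothetical instruction set: each mnemonic stands for a numeric opcode.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "SUB": 0x03, "STORE": 0x04}


def assemble(source):
    """Translate lines such as 'ADD 2' into (opcode, operand) number pairs."""
    machine_code = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        machine_code.append((OPCODES[mnemonic], int(operand)))
    return machine_code


def to_binary(machine_code):
    """Show each number as 8 binary digits."""
    return [f"{opcode:08b} {operand:08b}" for opcode, operand in machine_code]


program = """
LOAD 20
ADD 2
STORE 9
"""
code = assemble(program)
print(code)             # [(1, 20), (2, 2), (4, 9)]
print(to_binary(code))  # ['00000001 00010100', '00000010 00000010', '00000100 00001001']
```

The source text is readable mnemonics, while the output is the kind of numeric form described above for machine language. A compiler for a higher-level language performs a far more elaborate version of the same translation.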
Higher-Level Languages
Programming languages are sometimes discussed in terms of generations,
although these categories are somewhat arbitrary. Each successive generation
is thought to contain languages that are easier to use and more powerful
than those in the previous generation. Machine languages are considered
first-generation languages, and assembly languages are considered
second-generation languages. The higher-level languages began with the third
generation.
Third-Generation Languages
Third-generation languages have the capability to support structured
programming, which means that they provide explicit structures for branches
and loops. In addition, because they are the first languages to use
English-like phrasing, sharing development between programmers is also
easier. Team members can read each other's code and understand the logic and
program control flow.
These languages are also portable. As opposed to the assembly languages,
programs in these languages can be compiled to run on multiple CPUs.
Third-generation languages include:
· FORTRAN (FORmula TRANslator) was specifically designed for mathematical
and engineering programs. The language, which enjoyed immediate and
widespread acceptance, has been enhanced several times, most recently in
1990. The current version is often referred to as FORTRAN-90. Because of its
almost exclusive focus on mathematical and engineering applications, FORTRAN
has not been widely used with personal computers. Instead, FORTRAN remains a
common language on mainframe systems, especially those used for research and
education.
· COBOL (COmmon Business Oriented Language) was developed in 1960 by a
government-appointed committee. Under the leadership of retired Navy
Commodore and mathematician Grace Hopper, the committee set out to solve the
problem of incompatibilities among computer manufacturers. Partly because of
the government's backing, COBOL won widespread acceptance as a standardized
language. Although COBOL had lost most of its following over the past five
to ten years, the Year 2000 problem has required many COBOL programmers to
come out of "retirement" to help reprogram millions of lines of programs
written in COBOL to work after the year 2000.
· BASIC (Beginner's All-purpose Symbolic Instruction Code) was developed
by John Kemeny and Thomas Kurtz at Dartmouth College in 1964 and started out
largely as a tool for teaching programming to students. Because of its
simplicity, BASIC quickly became popular, and when personal computers took
off, it was the first high-level language to be implemented on these new
machines. Versions of BASIC were included with early personal computers,
even before IBM PCs came on the market. Although BASIC is an extremely
popular and widely used language in education and among amateur programmers,
it has not caught on as a viable language for commercial
applications--mostly because it just does not have as large a repertoire of
tools as other languages offer. In addition, BASIC compilers still do not
produce executable files that are as compact, fast, or efficient as those
produced by other languages.
· Pascal was introduced in 1971 by a Swiss computer scientist named
Niklaus Wirth. Named after the 17th-century French inventor Blaise Pascal,
Pascal was intended to overcome the limitations of other programming
languages and to demonstrate the proper way to implement a computer
language. Pascal is often considered an excellent teaching language.
Beginners find it easy to implement algorithms in Pascal. In addition, the
Pascal compiler enforces rules of structured programming, thus ensuring that
errors are caught early. Because the compilers of other languages do not
necessarily enforce these rules, finding errors in other programs may
require a lengthy debugging process. Almost all early Macintosh applications
were written in Pascal. Lately, Pascal has become well known for its
implementation of object-oriented principles of programming but currently
does not have the following it once had.
· C, which is often regarded as the thoroughbred of programming
languages, was developed in the early 1970s at Bell Labs by Dennis Ritchie.
(Brian Kernighan later coauthored the standard book on the language with
him.) Ritchie, with Ken Thompson, had also developed the UNIX operating
system, and they needed a better language to integrate with UNIX so that
users could make modifications and enhancements easily. Programs written in
C produce fast and efficient executable code and
are portable. C is also a powerful language--with C, you can make a computer
do just about anything it is possible for a computer to do. Because of this
programming freedom, C has become extremely popular and is the most widely
used language among professional software developers for commercial
applications. The disadvantage of such a powerful and capable language is
that it is not particularly easy to learn.
· C++ was developed by Bjarne Stroustrup at Bell Labs in the early 1980s.
Like C, C++ is an extremely powerful and efficient language. Learning C++
means learning everything about C, and then learning about object-oriented
programming and its implementation with C++. Nevertheless, more C
programmers move to C++ every year, and the newer language is now replacing
C as the language of choice among software development companies.
· Java is a programming environment that creates cross-platform programs.
It was developed in 1991 by Sun Microsystems for TV set-top boxes for
two-way interactive cable systems. When the Internet became a popular
communications network in the mid-1990s, Sun redirected Java to become a
programming environment in which Webmasters could create interactive and
dynamic programs (called applets) for Web pages. Java is similar in
complexity to C++. Nevertheless, many programmers and computer professionals
are learning Java in response to the growing number of companies looking for
Java applications. Sun hopes that Java will eventually become the de facto
programming environment, displacing C++ as the number one programming
environment.
Fourth-Generation Languages
Fourth-generation languages (4GLs) are mostly special-purpose programming
languages that are easier to use than third-generation languages. With 4GLs,
programmers can create applications rapidly. As part of the development
process, programmers can use 4GLs to develop prototypes of an application
quickly. Prototypes give teams and clients an idea of how the finished
application will look and operate before the code is finished. As a result,
everyone involved in the development of the application can provide
feedback on design and structural issues early in the process.
A single statement in a 4GL accomplishes much more than was possible in
a similar statement from an earlier-generation language. In exchange for
this capability to work rapidly, programmers have proved willing to
sacrifice some of the flexibility available with the earlier languages.
Many 4GLs are database-aware, which means that you can build programs
with them that work as front ends to databases. These programs include forms
and dialog boxes for inputting data into databases, querying the database
for information, and reporting information. Typically, much of the code
required to "hook up" these dialog boxes and forms is generated
automatically.
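The trade-off between a single high-level statement and step-by-step code can be sketched with a database query. This example rests on an assumption: it uses SQL, which is often grouped with fourth-generation languages but is not named in the text, run through Python's standard sqlite3 module. The table and its figures are invented.

```python
import sqlite3

# Hypothetical sales data held in an in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (region TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 12000.0), ("West", 8000.0), ("East", 3000.0)],
)

# One declarative statement: group, total, and sort the rows.
query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
totals_from_query = connection.execute(query).fetchall()

# The same report written step by step in general-purpose code.
totals = {}
for region, amount in connection.execute("SELECT region, amount FROM sales"):
    totals[region] = totals.get(region, 0.0) + amount
totals_by_hand = sorted(totals.items())

print(totals_from_query)   # [('East', 15000.0), ('West', 8000.0)]
print(totals_by_hand)      # [('East', 15000.0), ('West', 8000.0)]
connection.close()
```

The query says what result is wanted and leaves the how to the database, which is the kind of exchange of flexibility for speed described above.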
Fourth-generation languages include:
· Visual Basic is the newest incarnation of BASIC
from Microsoft. VB, as it is often called, supports object-oriented features
and methods. With this language, programmers can build programs in a visual
environment. To place a box on a form, for example, Visual Basic programmers
simply drag the box from a toolbox onto the form, as shown in Figure 11.31.
In other languages, the programmers would have to write code to specify the
exact placement of the box on the form, as well as its size. With Visual
Basic, a programmer places the box visually and then drags the edges of the
box with the mouse until it is the right size. The necessary code for the
box's placement and size is written automatically. Using this visual
environment, programmers find it easy to write programs quickly.
· Application-specific macro languages are built
into many applications. These languages give users the capability to write
commands and integrate applications. For Microsoft Excel, for example, the
macro language is Visual Basic for Applications (VBA). Using a spreadsheet
macro, you can write a sequence of commands to perform a task automatically,
such as bolding every entry of more than $10,000 in a spreadsheet. Macros can
be created automatically, or you can type in the macro yourself.
· Authoring environments are special-purpose
programming tools for creating multimedia, computer-based training, Web
pages, and so forth. One example of an authoring environment is Macromedia
Director (which uses the Lingo scripting language), which you can use to
create multimedia titles combining music clips, text, animation, graphics,
and so forth. Like Visual Basic, these development environments are visual,
with much of the code written automatically.
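The spreadsheet macro mentioned in the list above (bolding every entry of more than $10,000) comes down to a short loop. The Python sketch below shows only that logic. It is not VBA and does not use Excel's object model; the Cell class and the function name are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical stand-in for a spreadsheet cell."""
    value: float
    bold: bool = False


def bold_large_entries(cells, threshold=10_000):
    """Set bold on every cell whose value is more than the threshold."""
    for cell in cells:
        if cell.value > threshold:
            cell.bold = True
    return cells


sheet = [Cell(2_500), Cell(10_000), Cell(48_000)]
bold_large_entries(sheet)
print([cell.bold for cell in sheet])   # [False, False, True]
```

An entry of exactly $10,000 is left unchanged because the task calls for entries of more than that amount.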