You can call it John's law :-
All advances in computing have arisen through the creation of an additional level of indirection, the trick is to work out which indirection is actually useful.
I have also been putting a lot of other features of programming languages to the following test. "Does the feature force the programmer to specify the mechanism or the semantics?" My idea here is that the programmer should say what he means, and not tell the compiler how to do its job.
For example :- parameter passing mechanisms. If the programmer has to choose whether he passes an array by reference or by value, he is specifying the mechanism. If the programmer states that the parameter is input only, or output only or will be modified, he is specifying the semantics, and the compiler is free to choose the most appropriate mechanism to ensure the semantics are met efficiently.
I believe this approach will lead to a much more elegant language. For example, take the classic confusion and mess surrounding the humble assignment.
When we say "a := b;" we have a mess of semantics in the various languages.
In most languages the semantics for integer and floating point types is that the contents of b is copied into a.
Consider this piece of Pascal :-
temp := b; b := a; a := temp;
Here we mean to swap the labels on the objects in a and b. The mechanism we use is copying, what we actually meant was that the compiler should no longer consider 'b' to refer to what it was, but to refer to what 'a' was referring to. And vice versa.
Thus looking at the average program, it is certainly not clear what the implications of an assignment statement is. Not only does it often depend on structure/non-structure item context, but also on the fine implementation details of the object. If the object is some form of container, the assignment may mean shallow or deep copy, depending on the details.
I propose that copy ALWAYS means deep copy. I also propose a new assignment-like operator which I will call "handover". Instead of copying the value of 'b' to a, it will handover the object in 'b' to 'a'. Thus 'b' loses its binding and the 'a' takes it over. Thus the swap becomes
b handover to temp; a handover to b; temp handover to a;
At the end of this sequence, 'temp' is unbound and it is illegal to refer to it.
Now please be patient and see where this takes us. It takes us into a world were memory errors disappear and garbage collection comes for free.
Consider pointers. Pointers are integer labels/names of objects that exist in a global namespace. Just as 'temp' is a string of characters providing a means of referring to an object, a pointer is an integer that provides a means of referring to an object.
Think of this in terms of my semantics vs mechanism criterion. A C style pointer specifies the mechanism of a direct memory address. I say we should be concerned with the semantics. The semantics is we wish to be able to manipulate an integer name for an object.
Pascal attempts to safeguard this be removing the "integer" semantics. In C you can perform very useful, but unsafe integer arithmetic on pointers. Pascal bans this.
Nonetheless both Pascal and C pointers are labels in a GLOBAL namespace. A pointer created anywhere in the program is valid everywhere else.
Now any programmer worth his salt keeps the number of global variables to a minimum, because they are :-
Yet practically all languages force us to use pointers which are global integer names.
In addition to this, pointers create aliases to objects. In the presence of pointers, objects can have a myriad of different names. Thus it is highly unsafe to dispose of a object as it is highly non-trivial to establish whether it is still in use or not. Some languages provide sophisticated garbage collection mechanisms to cope with this.
I propose to resolve these problems by imposing the well-known database rule. "One fact in one place." I propose that all objects should be atomic in the sense that they can only be in one place, accessed via one name at a time.
This will be made possible by the "assignment is deepcopy" rule and the "handover" operator. There will be no address-of operator. There will be no way of creating an alias.
Now a lot of highly useful algorithms make sophisticated use of pointers, and at first sight it would seem as if I have lost a vast amount by imposing this restriction. I will now show that I have lost nothing and in fact gained garbage collection for free.
Clearly the language should provide some form of indexed container. (An array being the archetype). Now the "one object one place" rule will insist that if we place an object into a container, we "handover" the object to the container. Thus the object is now in the container and is no where else. The handover operation ensures that the variable holding the object previously, is now unbound, and the object is no longer accessible via that route. The only way of operating on that object is via the container.
Now the integer index of the object in the container provides all the functionality of a traditional pointer, but in a local, not global namespace.
However, and this is the cute bit, we need to know the name of the container and the integer index to access our object. Thus as soon as we move out of scope for that container, the container AND ALL THAT IT CONTAINS, becomes garbage and can be collected.
To make this a bit more concrete. Think of a C pointer dereference operator. It is the unary operator *. In my scheme it becomes a binary operator operating on the pointer and some container. In C the container is implicitly the global memory container.
At first sight you may say, "Oh no! He has come up with the old reference counting idea, and we all know that doesn't work because of circular references."
To a certain extent yes, but in fact I have come up with something more draconian that that. I insist that the reference count MUST be forced to be 0 or 1. This creates a different realm for the programmer. It some senses it is more limiting, but in fact it can be "godelized" to be as flexible as any other language. ("godelized", after Kurt Godel, who mapped symbolic logic to the natural numbers to create meta-proofs of mind-boggling implications. Indexed containers permit "godelization" of objects.)
What this rule does do, is create an object world that is far more like the familiar spatial universe we ourselves exist in. I can be in my office or I can be at home, I can never be in both. Thus when writing programs under this draconian constraint I am operating in an environment similar to the every day one.
The only class of memory leak that still remains is if the programmer is stupid enough to put the container in it self... (Ye olde circular reference, but with the extra spin that the container must disappear into itself!)
Hmm. This gives rise to a new name for this bug, the Klein bottle bug! (The Klein bottle is a bizarrity borrowed from topology, it is a bottle whose inside is the same side as its outside. You get one by taping the edges of two moebius strips together in a 4-dimensional space.)
Still, the compiler should be able to pick up most obvious occurences of Klein bottle formation, but I suspect in general it would still be possible.
Also the "this is in that" graph should be a nice tree it should be quite efficient to check for such an event. Maybe one can do some form of garbage collection sweep on program exit and find if you have Klein bottles lying around and if you do have Klein bottles, you can switch on checks for bottle formation next time you run the program.
There are also situations where some form of garbage collected container could be useful, but at least this scheme can allow the garbage collection process to work in a much smaller domain.
The implications of this scheme in the presence of multi-threading have not been thought out properly. I suspect that it will result in far fewer race conditions, but the I still need to work on the meaning of 'one object one name' in the presence of multithreading.
As far as I can see the main claims to fame of the functional languages are the following two points :-
All functional languages I have seen so far, have destroyed the tidy flow of cause and effect by allowing functions with side effects.
Thus in my language I propose regain the original purity by enforcing side effect free functions. For efficiency sake, procedures will be available for operations requiring side-effects, but at least one will be secure in the knowledge that all expressions are side effect free.
Note :- there is no reason why one cannot use efficient side effect creating operations like assignment within a side effect free function, if and only if those effects are local to the function and disappear on exit of the function.
Lazy evaluation, as I understand it, basically comes down to "call-by-need" parameter passing. For a long time I have resisted the idea of "call-by-need", as it triggers the evaluation of a piece of code, (the calling parameter expression), at an arbitarily later time (the first parameter reference). This, I felt sure, would create ample room for all kinds of very nasty bugs. However I can also see that certain computations would be greatly facilitated by Lazy Evaluation.
The resolution of the dilemma is to restrict Lazy evaluation to parameters of my "pure" functions. If "call-by-need" was allowed in procedures with side-effects, the results of the parameter expressions could be altered by side-effects in the procedure AND side-effects in the evaluation of the parameter expressions themseleves. This would make for incredibly hairy flow of cause and effect. If lazy evaluation was restricted to side-effect free functions, then one is gauranteed that the result of evaluating the parameter expression would be the same no matter when it is actually performed.
Two things about what I have said above affect the concept of parameter passing. One is the existence of pure functions, and the other is the concept of 'one object, one place'.
The 'one object, one place' rule would suggest that parameter passing be a 'handover' operation. In fact it must be one of the following options :-
The requirement for pure functions would be met by 'hand in and forbid any changes and then hand out'. This would be the only semantics allowed to function parameters. Note that the compiler need not formally go through the work of unbinding and rebinding, the compiler is free to choose the exact mechanism, so long as the semantics hold.
For procedures parameters can be one of :-
The operation of placing an object into a container would require a "hand in" parameter. For example :-
myStack.push( a); a.doSomething; // This is illegal as a is now unbound.
Note the difference between Input only and call-by-value. Call by value, is equivalent to shallow copy and hand-in. I object to this on two grounds :-
I could get around objection 2 by making call-by-value deepcopy and hand-in, but objection 1 would still remain.
Comments, queries and conversation.