Please don't take it badly, Paulo; it's only intended as constructive criticism.
Why is Logtalk not written in Logtalk?
The Logtalk compiler is written in Prolog and is about 16,000 lines of source. It is not "programming in the small", so why isn't Logtalk written in Logtalk?
There's a principle called "eating your own dogfood". By using a tool, you demonstrate the usefulness of the tool.
It is often asked, "if the compiler for language X isn't written in language X, why should I risk using it?" 
Well, I'm certainly "eating my own dogfood". I'm using Logtalk to develop non-trivial applications and helping others do the same.
Bootstrapping is used in most modern languages because it makes it possible to use the features introduced by the new language in that language's own implementation.
Logtalk has introduced good support for "programming in the large". Since the Logtalk compiler is 16,000 lines of source code and is a large monolithic system, it is clear that this is no longer "programming in the small". It would clearly benefit from decomposition, separation of concerns, re-use, object orientation, portable libraries, etc.
- Monolithic code. At present there is a single developer and there are few (if any?) contributions from others. This is partly because the code is monolithic and hence understanding the compiler is hard. There is no separation of concerns, no modular programming. Predicate interdependencies are not delineated. Navigating a 16,000-line file is cumbersome.
Actually, there is a sizable number of contributions: bug reports, feature requests, general feedback and discussion, and also some code contributions (for example, in a folder named "contributions" in the current Logtalk distribution). Not to mention the applications, theses, and papers where Logtalk plays a significant role. These are important contributions to one of the most important assets of a programming language: its user community.
Writing a compiler is neither "programming in the small" nor "programming in the large". Writing a compiler is, at least in the specific case of Logtalk, programming at a low level. This is a key difference. Some people, usually outsiders not familiar with Logtalk, think 16000 lines of code (well, not exactly, but I will talk about that in a moment) is a large number of lines for adding objects to Prolog (I know that's not your case). Most of the time they are oblivious to the Logtalk feature set. Personally, I'm always amazed at how few lines of code I had to write in order to implement Logtalk. Prolog is a wonderful, compact, and expressive language.
But let's look at the current Logtalk development version (r5116) and do some counting on the "compiler/logtalk.pl" file that implements both the Logtalk compiler and the Logtalk runtime:
Total number of lines: 16019
Empty (layout) lines: 3799
Lines containing only comments: 1414
Therefore, the total number of lines of *code* is 10806. Well, actually it is a bit lower, as not all lines that are essentially comments are accounted for above. How complex are these 10806 lines of code? Most of them are trivial code. Just to give you some examples:
Operator and predicate directives: 77
Tables (facts) of e.g. built-in predicates and functions: 453
Number of lines for printing Logtalk banner and flag values: 123
So, 653 lines of trivial code. We are down to 10153. Still a lot, you might say. Let's proceed. Not trivial but quite simple code:
Writing out XML documentation files: 798
Error-checking predicates: 223
Still 9132 lines of code. Things that you don't need to care about in order to understand the essential bits about either the compiler or the runtime:
Core high-level multi-threading implementation: 436
Definite Clause Grammars (DCGs) implementation: 416
Now at 8280 lines of code. This is roughly half the number of lines (16000) that you talk about repeatedly in your post. Are these remaining lines of code easy to grasp? Yes, provided that you take the time to read my PhD thesis. Some things, of course, are not there. An example is the dynamic binding caching mechanisms. Moreover, the difficult bits are getting better documented with each release.
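For anyone who wants to repeat this kind of count on their own copy of the file, here is a minimal sketch in Python (chosen only for illustration). It assumes that a "comment-only" line is one that starts with `%` or lies inside a `/* ... */` block; the function names `classify_lines` and `tally` are hypothetical, and this is not the method used to produce the numbers above. The second part simply replays the subtraction with the figures quoted in this post.

```python
def classify_lines(source: str) -> dict:
    """Count total, empty, comment-only, and code lines in Prolog source text.

    Simplification: a line that starts or ends a block comment is counted as
    a comment line even if it also carries code, and a '%' or '/*' inside a
    quoted atom at the start of a line would be misread as a comment.
    """
    counts = {"total": 0, "empty": 0, "comment": 0, "code": 0}
    in_block = False  # True while inside an unterminated /* ... */ comment
    for line in source.splitlines():
        counts["total"] += 1
        stripped = line.strip()
        if in_block:
            counts["comment"] += 1
            if "*/" in stripped:
                in_block = False
        elif not stripped:
            counts["empty"] += 1
        elif stripped.startswith("%"):
            counts["comment"] += 1
        elif stripped.startswith("/*"):
            counts["comment"] += 1
            in_block = "*/" not in stripped[2:]
        else:
            counts["code"] += 1
    return counts


def tally(total: int, groups: list) -> list:
    """Subtract each group of line counts in turn; return the running totals."""
    remaining = total
    steps = []
    for label, parts in groups:
        remaining -= sum(parts)
        steps.append((label, remaining))
    return steps


# Figures quoted in the post for compiler/logtalk.pl (r5116).
steps = tally(16019, [
    ("layout and comment lines", [3799, 1414]),
    ("trivial code", [77, 453, 123]),
    ("simple code", [798, 223]),
    ("multi-threading and DCGs", [436, 416]),
])
# Running totals after each step: 10806, 10153, 9132, 8280
```

Running `classify_lines(open("compiler/logtalk.pl").read())` would give the first three figures for whatever version of the file you have, within the limits of the simplifications noted in the docstring.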
- Predicates have ugly names. The following is a typical predicate name in the Logtalk compiler:
Code: Select all
This is due to two reasons:
1) Compiler predicates are prefixed with a $ to denote a system predicate, and the single quotes also exist for the same reason.
2) Part of the predicate name is a namespace that compensates for the lack of modular programming.
Both (1) and (2) are correct. Allow me to add two bits of additional information. Some Prolog compilers, not all but a fair number, automatically hide (e.g. from the debugger) predicates whose names start with the "$" character. Regarding "the lack of modular programming", in a dynamic language such as Logtalk, where you can do essentially everything at runtime, you cannot separate the compiler from the runtime.
A more elegant name would be
Code: Select all
or better still, with object orientation
Code: Select all
The compiler back-end should be able to take runtime::error_handler(...) and compile it to '$lgt_runtime_error_handler'(...)
Why? The requirements to generate an extensible compiler and an efficient runtime are not the same as the requirements to compile user applications. For example, the runtime tables and entity tables necessary to support essential features such as inheritance are basically irrelevant to the implementation of either the compiler or the runtime.
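To make the proposal under discussion concrete, the mapping from `runtime::error_handler(...)` to `'$lgt_runtime_error_handler'(...)` can be sketched as follows. This is a minimal illustration in Python, not how the Logtalk compiler works; the function name `mangle` is hypothetical, and the `$lgt_` prefix and the `_` separator are assumptions read off the two example names in this thread.

```python
def mangle(qualified_name: str, prefix: str = "$lgt_") -> str:
    """Map 'object::predicate' to a flat, single-quoted Prolog atom.

    The quotes are needed because an atom starting with '$' is not a
    plain unquoted atom in standard Prolog syntax.
    """
    obj, sep, pred = qualified_name.partition("::")
    if not sep or not obj or not pred:
        raise ValueError("expected a name of the form object::predicate")
    return "'" + prefix + obj + "_" + pred + "'"


flat_name = mangle("runtime::error_handler")
# flat_name is "'$lgt_runtime_error_handler'"
```

Note that a plain `_` separator is ambiguous: `a_b::c` and `a::b_c` would both flatten to the same atom, so a real scheme would need either a naming discipline or a separator that cannot occur in object names.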
This change in predicate naming would significantly improve the readability of the code. Good predicate naming, together with decomposition into objects, would make the source far more accessible.
Predicate (and variable) naming in the implementation of the Logtalk compiler and runtime is top notch if you forget for a moment the necessary quotes and the $lgt_ prefix. In case of doubt, I invite you to browse the source code of most open source Prolog compilers.
- Documentation. There is little documentation for the Logtalk compiler. Although Paulo's PhD thesis describes the internals it has not been updated since 2003 while the compiler has continued to evolve. It is not clear how up to date the PhD thesis is with respect to the current Logtalk implementation. Logtalk provides support for automatic documentation generation and would allow the compiler's documentation to evolve and stay up to date.
Logtalk automatic documentation generation cannot fulfill the requirements of describing the design decisions and algorithms used in the Logtalk implementation. It is a tool for documenting entities and predicates from an interface perspective.
Logtalk provides a host of great new features for developing large applications. The compiler is an ideal candidate for exploiting these features.
No, it's not. As I wrote at the beginning, the Logtalk compiler/runtime is not a large application; it is a low-level application. Quite different beasts.
They would help improve the Logtalk language by making it more readable and provide improved documentation. Furthermore, the code would be more maintainable; it would be easier to spot bugs and the barrier to entry would be significantly lower, thus allowing others to contribute.
You fail to mention that bootstrapping is not trivial to implement (and I'm talking from experience). It would set back Logtalk development for a long period of time. It would raise new classes of bugs and it would force the Logtalk compiler to target two separate sets of requirements, those of the Logtalk compiler/runtime itself and those of the user applications, adding complexity to a code base that I strive to refine and simplify.
Writing the Logtalk language in Logtalk is a sign of maturity and is the natural next step in the evolution of this language.
It would be a nice academic exercise, sure. From a pragmatic perspective it would be a big waste of the currently limited resources available for developing Logtalk. Bootstrapping sounds nice on paper but, in the case of Logtalk and, I suspect, of most programming languages, it would be a diversion. Good for bragging but of little impact on users' lives. I would rather spend my limited time on, e.g., implementing a good cross-referencer. Nevertheless, thanks for opening an interesting discussion.