@q file: bugs.w@>
@q% Copyright Dave Bone 1998 - 2015@>
@q% /*@>
@q% This Source Code Form is subject to the terms of the Mozilla Public@>
@q% License, v. 2.0. If a copy of the MPL was not distributed with this@>
@q% file, You can obtain one at http://mozilla.org/MPL/2.0/.@>
@q% */@>
@** Bugs in all their splendor.\fbreak
@*2 Error on ``file-overrun''.\fbreak
Where the meta terminals `eog' or `eof' have no co-ordinates assigned to them and the error token being generated needs a real co-ordinate assigned to it. The |tok_can| operator [] did not respect the requested token subscript when the end-of-file was reached. It always returned the `eog' token. Now if the requested subscript is |<=| the container's |pos_|, the appropriate token is returned, so that the associated error terminal attaches to the previous real terminal returned. The container is walked backwards looking for Mr. Right. \fbreak Jan. 1/2005.\fbreak \fbreak June 2008\fbreak ``eof'' has been end-of-the-line for \QUEshift.
@*2 Parallel parse assumed that the grammar would do more{...}.\fbreak
than just parse and accept a single \PARshift phrase. This showed up when I implemented a consolidated grammar to reduce the First set testing to launch threads. \fbreak Fix: replace |reduce| with |@|.\fbreak Jan. 1/2005.
@*2 Parallel thread table aborts when program winds down.\fbreak
This is a Microsoft problem as it's a simple template of map of thread strings and list of current threads available.\fbreak Jan. 1/2005.
@*2 \ALLshift and end-of-container.\fbreak
Ahh the Ides of March --- what do u do when the ``all shift'' facility is on and u have reached the ``eog'' or ``eof'' token: the end of the container? Originally I turned off the ``all shift'' facility and returned without executing the |all_shift| procedure if present in the configuration state. Overruns in any context are not liked.
Well an improvement to this situation is to turn off the facility and still execute the |all_shift| if it's present in the state's configuration. This allows the grammar writer to use this facility as an error handler.\fbreak Mar. 15/2005.
@*2 Test availability of |BIT_MAPS_FOR_SALE__|.\fbreak
Finally getting around to refining the constraints by adding an extern indicating the total number of words for sale. When bit maps need generating --- just-in-time manufacture per fsm state calling threads --- the global |BIT_MAP_IDX__| is the accrued number of maps already created. It is this value that is measured for overflow against |TOTAL_NO_BIT_WORDS__|. See |@| for implementation. On overflow an error is thrown.\fbreak Apr. 10/2005.
@*2 Monolithic grammar's |start_token| should be set in constructor.\fbreak
This error showed up when a standalone grammar was calling out of its first set a thread that should have run and didn't. The grammar highlighting the error was properly programmed but used the |start_token| procedure as a reference to set the error token co-ordinates. This type of error means either Yacco2's Linker did not properly generate its first sets, or the grammar writer did not regen the first sets using Linker after adding or subtracting terminals from the Terminal vocabulary. Now one can set it 2 ways: by calling one of these procedures |start_token| or |current_token|.\fbreak May 10/2005.
@*2 Mismatched file number associated with error token co-ordinates.\fbreak
Well this is just a dumb error! Like all others.\fbreak History:\fbreak To support nested file includes, 2 globals were used: |FILE_CNT__| and |NESTED_FILE_CNT__| to be efficiently clever. How so? I did not want to push, pop, and pant a stack. As new files were being processed, their literal names were kept in a map: file number and its description.
Of course this could be a vector but my file number starts from one due to my bias on counting; I'll stick with the bias but fiddle the vector after this. Now |FILE_CNT__| is an incrementing number while |NESTED_FILE_CNT__| is the nested level of includes. U guessed it: the file number being associated with the error was the nested level and not |FILE_CNT__|. So just stack the |FILE_CNT__| at time of file processing and use the stack depth to guard against runaway file recursion.\fbreak 19 May, 2005.
@*2 Validate accept message against the new lookahead token position.\fbreak
With experience, this reality check is not needed. Why? Error tokens can be returned from a thread with no consumption of the token stream occurring. The check came about when threads were being developed with the assumption that tokens returned consumed the current token stream. This is not the case, as one could post process tokens and forward post an error past the current token position or re-align the error outside of the token stream being read. Now with a more creative approach to error handling and threads working properly, this check is too restrictive. So beware as it can still happen.\fbreak 26 May, 2005.
@*2 Linux bug --- dropping namespace |yacco2::| on extern "C" referenced objects.\fbreak
The |yacco2| namespace wrapped the below globals to manage threads. They get defined by Yacco2's linker. Now the shaker: originally I referenced these globals by using "C" extern. I used this approach to indicate that other languages could get a hold of them though the real use of extern "C" is for functions and the order of parameters pushed onto the calling stack. Unfortunately when porting |yacco2| to Linux these globals were not resolved by the regular language linker. The |wthread.cpp| code that referenced them compiled but emitted object code without the |yacco2::| prefix.
\ptindent{1) extern "C" void* |THDS_STABLE__;|}
\ptindent{2) extern "C" void* |T_ARRAY_HAVING_THD_IDS__;|}
\ptindent{3) extern "C" void* |BIT_MAPS_FOR_SALE__;|}
\ptindent{4) extern "C" int |TOTAL_NO_BIT_WORDS__;|}
\ptindent{5) extern "C" int |BIT_MAP_IDX__;|}
\ptindent{6) extern "C" |CAbs_lr1_sym|* |PTR_LR1_eog__;|}
The fix: drop the "C" from the above extern statements. The object code now contains the |yacco2::| prefix to these globals.\fbreak 25 July, 2005.
@*2 Why me the guinea pig using other \CPLUSPLUS/ compiler foibles?.\fbreak
Linux ugh what's it good for? absolutely... as the song goes. The out-of-the-box \CPLUSPLUS/ compiler generates unresolved references that are due to its template processing. Going thru g++ to assembler output only and looking for the undefined references from their STL and using the ``nm'' facility to see an object's symbols just doesn't help. So the moral of this story is to try another compiler like Intel \CPLUSPLUS/? or should I become involved with the free-open-source movement. For now my time is limited and so I will take the first option.\fbreak 31 July 2005
@*2 MS \CPLUSPLUS/ problems.\fbreak
While converting to the dynamic approach to tracing, the MS \CPLUSPLUS/ compile hit the wall. Its symbol table management got confused by symbols that had common prefixes. Enough of my rants --- detour no: xxx. At least I can still keep going instead of the more fundamental problem posted about Linux and the unresolved ctors from template instantiation.\fbreak 5 Aug. 2005
@*2 Regular parse and no input container: just parsed the empty language.\fbreak
To support grammars as logic sequencers, i forgot to force a |current_token__ = yacco2::PTR_LR1_eog__;| against the current token within the parser ctor when no input token container was supplied. Even though there is no token consumption taking place, the parser starts things off by fetching the first token.
If there is no token present, the ctor of the parser does not set up for parsing: parse stack etc, but exits as if an empty language had been parsed. Correction:\fbreak In this case there is no token so i force the meta terminal |eog| to indicate the end-of-the-token-stream: a bit of a hack as regular parsing expects to receive its input from a token container but works as there is no token consumption by this particular grammar. This approach represents properly the empty language string when the grammar / parser consumes the token stream. Observation:\fbreak As this is a very simple correction, why wasn't it programmed properly? Again the forest versus the trees situation. Local patch without overall assessment of how parsing requires a token. Now i'm being hard on myself as it was caught with my 1st test try but the observation still holds.\fbreak 16 Aug. 2005
@*2 MS 7{.}0 heap delete bug{...}.\fbreak
I commented out the delete statement so that things at least work.\fbreak 31 Oct. 2005 Ghoulish wonders...
@*2 MS 7{.}0 bug pranks.\fbreak
For now bypass the |delete_tokens| request by returning immediately out of the routine.\fbreak 31 Oct. 2005 wonders never cease...
@*2 Intel \CPLUSPLUS/ release 9.\fbreak
History:\fbreak Well, Intel's VTune is an excellent product that works first time. So this experience, my problems with Red Hat's gcc compiler weaknesses of not compiling proper code, and MS compiler 7.0 having little displays of irregularity led me to try out Intel's compiler products, particularly when Apple is endorsing their chips --- chip wars with salt and vinegar? Well the install was easy and the anticipation high as to performance, optimization, and space. Crunch crunch crunch --- that's the sound of the man ... Enough of my mental droppings... mumblings in karaoke. Hear's the scoup (intended): The compiler is approximately 3 times slower than MS compiler.\fbreak Code bloat is in fat city --- 5.5 times bigger.
My program is 675k using MS versus 3350k for Intel\fbreak The killer: the code produced does not handle a multi-threaded program and its contexts. It loses its proper thread run context. This did take place in Visual Studio 6.0 but they corrected this in release 7. As \O2 starts with no threads --- on demand, the thread table of workers grows dynamically according to jit source context. Now the lost context: when a thread finishes its work, it sets its working status back to waiting-for-work. This setting does not happen with Intel's version of \O2. So the thread table keeps growing to approximately 2k threads created and then the program goes into a deadly wait state where all parties are politely nodding. Upon debugging this in 2 ways: log all the events textually (let's hear it for my tracings: all events turned on --- messaging between events, arbitration, tokens fetched, etc) and use of Intel's source code debugger, 2 things came out: the Intel debugger gets lost upon single stepping the source code for |set_waiting_for_work| and the smoking gun displayed its evidence as more common threads got created like |eol| where they were always busy even after their completion. All this in 3 hours of high expectations to the sobering truth that \CPLUSPLUS/ compilers are gum and shoe laced together in a top-down affair. Now Sunday 4 December, my clean up to bring me back to living with MS \CPLUSPLUS/ and its little tantrums. At least it compiles fast, and my program runs in release mode. In debug mode, MS \CPLUSPLUS/ has a bit of a problem with its memory re-cycling at program-exit time but this is now tolerated as there is nowhere to go for me at present. Hey what about Apple? i'll see how they do regarding top-down compiling. What about HP/Compaq/Dec? It worked 2 years ago so my porting of the Pascal translator will be the test with HP's new STL. Alas i'm becoming more convinced of formal methods to compiling. This certainly saddens me a lot in year end 2005...
about Intel's state of affairs regarding compiling? or was it just their \CPLUSPLUS/ implementation? I just don't know as the song goes but \CPLUSPLUS/ certainly is a dog of a language to get right particularly when porting to different platforms exposes different compiler weaknesses. Wait till the meta-language crew start exhorting their virtues. Just try single stepping those songs! \fbreak 4 Dec. 2005\fbreak
@*2 Apple's cough in handling template definition.\fbreak
See |Sour Apple on template definition| for an explanation of why the slight aberration and work around.\fbreak 13 Dec. 2005\fbreak \fbreak Apple's response was fast and polite. They quoted the \CPLUSPLUS/ Standard showing that this was left to the implementors and that their interpretation was appropriate. Upon reading the Standard, they are right. The others (Microsoft, Intel, HP) use a more general approach and in my opinion would be the direction i would take dealing with glorified macros but kudos to the Open Source implementation. My correction was minimal: place all referenced variables before the defining template shell.\fbreak Feb. 2006
@*2 HP Alpha \CPLUSPLUS/ ``this'' object mis-address.\fbreak
See |worker_thread_blk initialization: threaded grammar|.\fbreak The launched thread places the |worker_thread_blk| ``this'' pointer within the |Parallel_thread_table| for thread reuse. Unfortunately the address of this object is not the same as the address within the containing grammar's parser object. Apple and Microsoft got it right!\fbreak \fbreak The fix:\fbreak As the parser object containing it is also passed for tracing purposes, i now fetch its address thru the parser's object. \fbreak 10 May 2006. \fbreak \fbreak Take 1.329...\fbreak The problem was |ctor()| producing a temporary variable that became |ctor(ctor(x)\&))| in the initialization list of a defining ctor. Eg, box A contains box B where box B has only B(x) ctor.
|A::A():b_(B(x)){};| is the problem.\fbreak The ctor of box B in the list produces a temporary variable and \CPLUSPLUS/ creates an implicit default ctor of |B()| and an implicit copy |ctor(B\&)|. Why did u not just program |A::A():...b_(x){};|? where the argument to the |b_| variable in the list is a regular ctor declaration? U got it, this is circa code of 1998 where the \CPLUSPLUS/ compilers were not so good and that was the only way to initialize the variable in the list. Now 3 flavors of MS \CPLUSPLUS/ compilers, 1 old Alpha compiler, and Apple's compiler morphed the code seeing that a temporary variable is not needed and respected the old way of compiling. Alas the vagaries of the past the present the future.\fbreak 20 July, 2006\fbreak
@*2 Rule reuse but forgot to remove the ``AD'' from each grammar.\fbreak
For speed, the mallocing of rules is too expensive so i calculated its re-use count. See {\bf |rules_use_cnt|} grammar on how it's done. With each rule's ``AD'' auto delete attribute turned on, the rule symbol got deleted every time it was popped from the parse stack. Consequence: any reference to the rule became a ghost reference.\fbreak \fbreak Solution: just remove the attribute declaration from each rule within their grammars.\fbreak Nov. 2007\fbreak
@*2 Recursion on ``Procedure call'' of a thread.\fbreak
Ugly things happen as the thread's cloned ``procedure call'' is {\bf not re-entrant} due to ctor / run / dtor overhead. Its fsm table is global and can only support 1 call at a time. This is a design decision for speed reasons. Needed is a recursion detection table |Parallel_thread_proc_call_table| to register call attempts for all threads. When called as a procedure turn on the use and remove the registered use after it has returned from the call. This table is mutex protected unfortunately but necessary due to parallelism. \fbreak Apr.
2008\fbreak
@*2 VMS miscue on Mutex Recursion and Pthread stacksize.\fbreak
Ugly things happen as the thread is activated. The pthread's default stack size |pthread_attr_t| variable does not set the stack size properly, which causes the pthread library to throw up. So explicitly set it using the |pthread_attr_setstacksize| procedure before the pthread create. The second more serious issue is its detection of what it thinks is recursion on a single use mutex. Its reaction is downright violent --- spews of core dump and attempts at calming the horde with information messages of potential inaccuracies. This reaction is illusory as this is not so. Each thread or its singular procedure partner has their own private copy for the control message Mutex and Conditional variables. This was tested on Unix out-of-the-box Pthread library variants (Sun and Apple) without this hacking or is it gagging? So just remove the ``procedure call'' optimization for VMS and make it a thread call. \fbreak Aug. 2008\fbreak
@*2 \QUEshift{} used instead of \ALLshift{} making it a perpetual motion machine.\fbreak
Guard against \QUEshift as it does not advance |get_next_token| so the parse keeps on going dancing at the same token spot: this is a perpetual motion machine --- swap file eventually fills up and Boom Ca Boom. Sometimes the grammar writer is improperly using the \QUEshift instead of an epsilon rule. So how to detect this? Well if the |has_questionable_shift_occured__| has been previously set, then stop parsing instead of aborting. Should i message or not to message that is the question. I'll message the errant grammar and parse stack state where the problem was detected. The grammar writer should use the \ALLshift{} symbol.\fbreak Patched |@|.\fbreak Sept. 2008\fbreak
@*2 Rule reuse Code emission did not store the newed rule in its recycle table.\fbreak
I did not store the newed grammar rule in its recycle table. This was brought out using a marvellous tool called dtrace from Sun.
Well the thought was right but my details were wrong --- like the kid who runs ahead in thought while learning to crawl. The other part to rule recycling is making sure a local grammar rule's variables are re-initialized as the past dribbles will affect the present. Speak clearly boy! Example, in |la_expr| grammar |Ra| and |Rt| rules contain the local set |fset_| variable. This holds the terminal in the lookahead expression so that set ``union and difference'' expressions can take place. A recycled rule with this set not cleared will contain its past history. This is the cost of an optimization: 25\% improvement so be forewarned.\fbreak \fbreak Dec. 2008\fbreak
@*2 String template container did not set the |eof_pos_| variable and random boom.\fbreak
The sky is falling. As the string container didn't set this variable, random droppings other than EOF meant that at least a first read on the string container would take place. Well u guessed it. As it was never read the eof symbol was not set and so nil pointer on the returned token. At least the file container set |eof_pos_| properly. Alas just sloppiness Dave and a swill to u. \fbreak Mar. 2009\fbreak
@*2 |TOKEN_GAGGLE|'s virtual table access [] operator not respected.\fbreak
This showed up in an xml/message dispatcher system written for VMS/Alpha. The ``Error queue'' being parsed was getting an array out-of-bounds error when the end-of-token stream was reached. THE |TOKEN_GAGGLE|'S ACCESS [] PROTECTS AGAINST THIS. But the internal container used, aka STL's array container, was being called directly. This problem only occurred in the c++ VMS/Alpha port. Sun, Apple, Linux flavours all worked by respecting the virtual table of the abstracted |tok_base|. So tighter checks are done within the |get_next_token| Parse method, ensuring the |current_token__| is always set on an empty container or any of the overflow checks. Originally |current_token__| was set only when the overflow was first detected.
As a post evaluation, the Parser ``Error queue'' which was originally declared as a |TOKEN_GAGGLE| is now declared as an abstract |tok_base| just like the other containers Supplier, Producer, and Recycle bin. This allows the language designer to use a different Error container like trees. In conclusion, though not a bug but a porting weakness, this modification makes the Parser more flexible. So Dave your fixed Error thoughts are virtualized. \fbreak Oct. 2010\fbreak
@*2 Procedure calls in VMS revisited: thread versus procedure.\fbreak
Revisited the optimization on procedure calling of grammars when only 1 grammar is to be called. This is a major improvement over thread calls. Well this is the scoop. Make sure that the stack parameter to the VMS linker is adequate or not fun abortive things happen within a called thread that u know works. This happened to a command that was parsed properly using the same called thread while the other command to be parsed aborted. Second, make sure there are no overruns in a std type container happening. Somehow only VMS has a problem guarding against an overrun which is properly guarded against within \O2's library. For now the code in |@| has renamed the conditional variable |VMS_| to |VMS111__| so that it is not used. I'm keeping it there as a reminder to possible future regurgitations. \fbreak Nov. 2010\fbreak
@*2 Size of tree container --- number of items in container.\fbreak
What is the size of the tree container? It depends whether its end-of-tree has been reached. So put a conditional test in its size method: return |MAX_UINT| if the tree walk is still in process. If end-of-tree has been reached then return the size of its internal container. \fbreak Feb. 2011\fbreak
@*2 |Find_reduce_entry| current token not found.\fbreak
Due to my stupidity. The searching for the subrule reducing was optimized. Not to get into my stupidity but the meta symbols were found before the next subrule's LA set was searched.
The correct search is 2.5 passes --- find the current tok against the potential subrules, followed by the meta symbols against a new round of the potential subrules list, and then the last gasp \QUEshift is searched if the previous passes are not met. \fbreak Nov. 2012\fbreak
@*2 Date macro use --- Apple LLVM \Cpp{} compiler.\fbreak
Version 5.1 (clang-503.0.40) (based on LLVM 3.4svn).\fbreak This shows up when the version literal per \O2linker{} and \O2{} is built. See ``runtime\_env.w'' file for details. Must split lines or delimit by spaces when concatenating the macro '\_\_DATE\_\_' with its bounding literals.\fbreak Example: "xxx" \_\_DATE\_\_ "yyy" //works cuz spaces\fbreak Without the spaces the compiler thinks it's a template mistake with this error:\fbreak No matching literal operator for call to 'operator"' \_\_DATE\_\_ with arguments of types 'const char*' and 'unsigned long', and no matching literal operator template. \fbreak Apr. 2014\fbreak
@*2 Eog symbol not gpsing on external file and internal line no.\fbreak
Here's the shtick. I was playing around with the Pager\_1.lex grammar. To make it interesting, the T vocabulary files were changed. By mistake the Error T vocabulary file did not have a close off brace: $\}$. So the right error was thrown but the file co-ordinates were 0 and did not reference the external file! Looking at the |tok_can| container, the end-of-file indicator was passing the appropriate file references. So what the heck? Well to the rescue: yacco2::YACCO2\_T\_\_ tracing of Tes, in all the gory details, and lo and behold the ``eog'' had no external references. Well the culprit was |map_char_to_raw_char_sym| that draws from its premade raw character pool and makes a T symbol. It was passed the appropriate external file's co-ordinates but... To quicken raw character mapping to a |CAbs_lr1_sym| symbol a premade |PTR_LR1_eog| symbol was just returned without setting the passed-in file co-ordinates. \fbreak \fbreak Man Dave you sure r a winner!
\fbreak May 2014\fbreak
@*2 Cleaned up Arbitrator's |YACCO2_AR__| tracings.\fbreak
2 items corrected:\fbreak
\ptindent{1) misplaced |@;| in |TAR_2| walking accept-queue. The 5 computer nerds waiting}
\ptindent{2) commented out for |TAR_1 - 3| macros use of |trace_parser_env| }
\fbreak Oct 2014\fbreak