Commits

Author Commit Message Labels Comments Date
Phil Hargett
Fleshed out HTML tests, with several negative tests. So far so good; gonna merge next
Branches
parsing
Phil Hargett
Adding profile information mostly as a reference for future use.
Branches
parsing
Phil Hargett
Added defgrammar and deflexer to streamline new definitions of both. Simplified exports and added to package file. Verifying basic HTML parsing still works.
Branches
parsing
Phil Hargett
Appear to have a working LALR(1) parser for a basic html grammar
Branches
parsing
Phil Hargett
Optimized grammar construction a bit by using a vector to hold states and pre-computing the exits from each state (i.e., for each symbol that transitions out of a state, the index of the state to transition to)
Branches
parsing
Phil Hargett
Small cleanups in grammar
Branches
parsing
Phil Hargett
Although unwieldy, the html grammar is sort of working--can parse first token successfully
Branches
parsing
Phil Hargett
I believe I have added code that removes all occurrences of :nil from a grammar, after making proper substitutions first to expand the rules that remain. Visual inspection of HTML grammar looks correct, and the resulting expanded rule set is not significantly larger than the original.
Branches
parsing
Phil Hargett
Refinements; need to remove :nil from grammars now
Branches
parsing
Phil Hargett
Cleaned up lexer error report function, and streamlined html lexer code more
Branches
parsing
Phil Hargett
Functional html lexer; lots of cleanup still to do, so work in progress
Branches
parsing
Phil Hargett
Removed obsolete package lines from system definition (were commented out)
Branches
parsing
Phil Hargett
Restructuring, splitting grammar, samples, and parser into separate files, plus other cleanup
Branches
parsing
Phil Hargett
Source code data structures available again
Branches
parsing
Phil Hargett
Removed unnecessary globals
Branches
parsing
Phil Hargett
Removed unused LR0 constructs
Branches
parsing
Phil Hargett
Added grammar object, distinct from parser, and distinct from the grammar's specification
Branches
parsing
Phil Hargett
Work in progress; initial implementation of parsing algorithm and table construction
Branches
parsing
Phil Hargett
Now building LR1 items
Branches
parsing
Phil Hargett
Converted to using typed items & productions
Branches
parsing
Phil Hargett
omg, am I paying attention? still more duplicate code
Branches
parsing
Phil Hargett
Whoops, duplicate definitions
Branches
parsing
Phil Hargett
Splitting out grammar into separate file (and overwriting what we had before)
Branches
parsing
Phil Hargett
Basic LR0 item-set construction complete, and a suitable extended grammar properly generates fundamental grammar forms.
Branches
parsing
Phil Hargett
Temporarily re-incorporating (but not using) the work to transform grammars with loops, optional elements, and alternatives.
Phil Hargett
Just adding additional clarifying comments
Phil Hargett
Small refactorings of parser code in preparation for a possible tokenizer variation
Phil Hargett
More progress towards tokenizing. Separated HTML tokens out into a separate file, and based the grammar on that.
Phil Hargett
Renamed some methods/slots to reference tokens rather than characters, in preparation for moving to token streams, not character streams
Phil Hargett
Removed some tracing that is a bit noisy now, since we've solved our copy thread problems (yes, it could be bad to remove the tracing that actually solved a problem, while leaving all the other stuff intact. ;) )