Scanning
The created LL(k) parser expects token-based input stored in a Tokenizer-derived object.
The Tokenizer class, defined in tokenizer.h, maintains a std::vector of Token objects. It is meant to be inherited by a custom class that scans the input by some method and stores the resulting tokens.
The Token structure maintains the following information:
int code; // a unique integer identifying this type of token
std::string text; // the lexeme of the token
size_t line_no; // the line number of the lexeme
size_t columno; // the column number of the lexeme
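To make the role of each field concrete, here is a minimal sketch. The Token struct below is a stand-in that mirrors only the four fields listed above, not the definition from tokenizer.h. The TokenCode enum and the describe helper are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <string>

// Stand-in mirroring the four documented fields; not the library's actual definition.
struct Token {
    int code;             // integer identifying the token type
    std::string text;     // lexeme
    std::size_t line_no;  // line number of the lexeme
    std::size_t columno;  // column number of the lexeme
};

// Hypothetical token codes: any scheme works as long as each
// token type is given a distinct integer.
enum TokenCode : int { TOK_NUMBER = 256, TOK_IDENT = 257 };

// Hypothetical helper: formats a token's lexeme and position,
// for example as part of an error message.
std::string describe(const Token& t) {
    return "'" + t.text + "' at line " + std::to_string(t.line_no) +
           ", column " + std::to_string(t.columno);
}
```

Starting the enum at 256 is one possible convention: it leaves the values below 256 free for single-character tokens whose code is the character itself.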
Tokenizer provides functions for basic maintenance of its std::vector of Token structures; the scanning itself is left to a user-written derived class. The FlexTokenizer class, defined in flextokenizer.h, is one such derived class: it scans and tokenizes input using Flex.
Below is another example Tokenizer-derived class that tokenizes the bits of a binary number:
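class BinaryTokenizer : public Tokenizer
{
public:
    // Emits one token per bit and stops at the first character that is not '0' or '1'.
    BinaryTokenizer(const std::string& str)
    {
        for (std::size_t i = 0; i < str.length(); i++)
        {
            if (str[i] == '0')
                emplace_back('0', "0", 1, 0, 0);   // the character itself serves as the token code
            else if (str[i] == '1')
                emplace_back('1', "1", 1, 0, 0);
            else
                break;
        }
    }
};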
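The example above passes the same constant trailing arguments for every token. If a derived class should instead record where each token was found, the scanning loop can track the position itself. The sketch below is self-contained, so it does not use tokenizer.h: Token and Tokenizer are simplified stand-ins, and the add and tokens members and the PositionedBinaryTokenizer name are hypothetical, not part of the library.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the library's Token; only the four documented fields are modeled.
struct Token {
    int code;
    std::string text;
    std::size_t line_no;
    std::size_t columno;
};

// Stand-in for the library's Tokenizer base class (hypothetical interface).
class Tokenizer {
public:
    virtual ~Tokenizer() = default;
    const std::vector<Token>& tokens() const { return tokens_; }

protected:
    // Appends one token to the stored vector.
    void add(int code, std::string text, std::size_t line, std::size_t column) {
        tokens_.push_back(Token{code, std::move(text), line, column});
    }

private:
    std::vector<Token> tokens_;
};

// Tokenizes '0' and '1', skips newlines while tracking the position,
// and stops at the first character that is none of these.
class PositionedBinaryTokenizer : public Tokenizer {
public:
    explicit PositionedBinaryTokenizer(const std::string& input) {
        std::size_t line = 1;    // lines are counted from 1 in this sketch
        std::size_t column = 0;  // columns are counted from 0 in this sketch
        for (char c : input) {
            if (c == '\n') {     // a newline starts a new line and emits no token
                ++line;
                column = 0;
                continue;
            }
            if (c != '0' && c != '1')
                break;           // same stopping rule as BinaryTokenizer
            add(c, std::string(1, c), line, column);
            ++column;
        }
    }
};
```

Constructing PositionedBinaryTokenizer from a multi-line string of bits stores one Token per bit, each carrying the line and column at which it was read. Whether lines and columns start at 0 or 1 is a choice made here for illustration; a real derived class should follow whatever convention the generated parser's error reporting expects.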
Other examples of Tokenizer-derived classes can be seen in the Examples wiki page.