GitHub - maiermic/antlr4-ace-ext: Tokenizer for ACE editor to do syntax highlighting using an ANTLR4 lexer.

How to install

Use bower to install:

bower install --save antlr4-ace-ext

You can install ACE editor from bower, too:

bower install --save ace-builds

How to use

After ace is loaded

<script src="bower_components/ace-builds/src-noconflict/ace.js"></script>

add the antlr4-ace-ext scripts:

<script src="bower_components/antlr4-ace-ext/src/token-type-map.js"></script>
<script src="bower_components/antlr4-ace-ext/src/tokenizer.js"></script>

They register themselves as ACE modules ace/ext/antlr4/tokenizer and ace/ext/antlr4/token-type-map. You can require them in your mode:

ace.define(
  'ace/mode/my-mode',
  [
    "require",
    "exports",
    "module",
    "ace/ext/antlr4/tokenizer",
    "ace/ext/antlr4/token-type-map"
  ],
  function(require, exports, module) {
    var createTokenTypeMap = require('ace/ext/antlr4/token-type-map').createTokenTypeMap;
    var Antlr4Tokenizer = require('ace/ext/antlr4/tokenizer').Antlr4Tokenizer;
    // ...
  }
);

Override the getTokenizer method of your mode class to use your custom tokenizer:

MyMode.prototype.getTokenizer = function() {
  if (!this.$tokenizer) {
    this.$tokenizer = new Antlr4Tokenizer(MyLanguageLexer, antlrTokenNameToAceTokenType);
  }
  return this.$tokenizer;
};

The Antlr4Tokenizer constructor takes a lexer class generated by ANTLR4 and a mapping from ANTLR4 token names to ACE token types (see common ACE tokens). Literal token names are written with their single quotes, symbolic token names without. For example:

{
  "'+'": 'keyword.operator',
  "'-'": 'keyword.operator',
  "'return'": 'keyword.control',
  "ID": 'identifier',
  "INT": 'constant.numeric'
}
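To illustrate how such a mapping could be applied, here is a minimal sketch that turns the lexer tokens of one line into ACE tokens of the shape { type, value }. It is not the implementation used by Antlr4Tokenizer: the helper name toAceTokens, the simplified input token shape { name, text, start }, and the fallback to the 'text' type for unmapped names are all assumptions made for this example.

```javascript
// Hypothetical helper, not part of antlr4-ace-ext.
// Each lexer token is assumed to look like { name, text, start }, where
// `name` is the ANTLR4 token name and `start` is its column in the line.
function toAceTokens(line, lexerTokens, tokenTypeMap) {
  var aceTokens = [];
  var column = 0;
  lexerTokens.forEach(function (token) {
    // Text the lexer skipped (e.g. whitespace) is emitted as plain text,
    // so that the token values together still cover the whole line.
    if (token.start > column) {
      aceTokens.push({ type: 'text', value: line.slice(column, token.start) });
    }
    aceTokens.push({
      // Assumption: unmapped token names fall back to the 'text' type.
      type: tokenTypeMap[token.name] || 'text',
      value: token.text
    });
    column = token.start + token.text.length;
  });
  // Emit whatever remains after the last lexer token.
  if (column < line.length) {
    aceTokens.push({ type: 'text', value: line.slice(column) });
  }
  return aceTokens;
}

var tokenTypeMap = {
  "'+'": 'keyword.operator',
  "'return'": 'keyword.control',
  "ID": 'identifier',
  "INT": 'constant.numeric'
};

var line = 'return x + 1';
var aceTokens = toAceTokens(line, [
  { name: "'return'", text: 'return', start: 0 },
  { name: 'ID', text: 'x', start: 7 },
  { name: "'+'", text: '+', start: 9 },
  { name: 'INT', text: '1', start: 11 }
], tokenTypeMap);
```

In this sketch the first token of the result is { type: 'keyword.control', value: 'return' }, and joining all token values gives back the original line.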

You can use the helper function createTokenTypeMap to create a token type map for your Antlr4Tokenizer:

var antlrTokenNameToAceTokenType = createTokenTypeMap({
  literals: {
    'keyword.operator': ['+', '-'],
    'keyword.control': 'return'
  },
  symbols: {
    'identifier': 'ID',
    'constant.numeric': 'INT'
  }
});

This way, you do not have to quote literal token names, and you can map several token names to the same ACE token type by passing them as an array.
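The relationship between the helper's input and the quoted mapping can be sketched as follows. This is a hypothetical re-implementation for illustration only (the name sketchCreateTokenTypeMap is made up here), and the real createTokenTypeMap may behave differently in details such as validation or handling of duplicates.

```javascript
// Hypothetical re-implementation for illustration only; the real
// createTokenTypeMap in antlr4-ace-ext may differ.
function sketchCreateTokenTypeMap(config) {
  var map = {};

  function addAll(section, toTokenName) {
    Object.keys(section || {}).forEach(function (aceType) {
      // Accept either a single name or an array of names.
      [].concat(section[aceType]).forEach(function (name) {
        map[toTokenName(name)] = aceType;
      });
    });
  }

  // Literal token names are wrapped in single quotes: + becomes '+'.
  addAll(config.literals, function (literal) { return "'" + literal + "'"; });
  // Symbolic token names such as ID are used as they are.
  addAll(config.symbols, function (symbol) { return symbol; });
  return map;
}

var sketchMap = sketchCreateTokenTypeMap({
  literals: {
    'keyword.operator': ['+', '-'],
    'keyword.control': 'return'
  },
  symbols: {
    'identifier': 'ID',
    'constant.numeric': 'INT'
  }
});
```

Under these assumptions, sketchMap has the same keys and values as the hand-written mapping with quoted literal names.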

Example

See the browser example of the Cymbol language (Demo).

6.4 Parsing Cymbol

To demonstrate how to parse a programming language with syntax derived from C, we’re going to build a grammar for a language I conjured up called Cymbol. Cymbol is a simple non-object-oriented programming language that looks like C without structs.

from The Definitive ANTLR 4 Reference

How to build

Required

Node.JS
ANTLR4 (antlr4 has to be available as an environment variable to (re)build the grammar files)

Build Instructions

Install dependencies: npm install
Build project: npm run build
Run tests: npm test
