Tokenize and colorize asm strings #2417

Merged
merged 64 commits into from
Jul 27, 2022
Changes from 62 commits
Commits
9eb55d2
Introducing asm tokens
Rot127 May 4, 2022
edca53e
Pass reg sets to generic parser for register matching.
Rot127 May 4, 2022
5a304d7
Remove unnecessary sorting (tokens are sorted due to sequential parsing)
Rot127 May 4, 2022
dd03aaf
Always sort regex parsed tokens.
Rot127 May 4, 2022
9132b22
Fix generic parsing: i = start of token, l = length. Fix ambiguities.
Rot127 May 4, 2022
a20c05a
Rename `reg_set` -> `reg_sets`
Rot127 May 4, 2022
59f60f1
Rename `l` -> `j` since it is not the length.
Rot127 May 4, 2022
031ab4c
Update documentation.
Rot127 May 4, 2022
29e1de6
Remove unnecessary sort.
Rot127 May 4, 2022
a659981
Replace self build `is_alpha` type functions with C89 ones where poss…
Rot127 May 4, 2022
1b7e379
Fix up regex parsing after previous changes
Rot127 May 4, 2022
b8689ff
Fix unit tests.
Rot127 May 4, 2022
f9dd5ad
Simplify token range check.
Rot127 May 4, 2022
128390d
Add debug include guard
Rot127 May 4, 2022
7cd9e20
Move color reset info into RzPrint.
Rot127 May 4, 2022
63f58c0
Remove tokenize/colorize function to enable set of op type.
Rot127 May 4, 2022
3753b5e
Move tokenize logic into RzAsm
Rot127 May 4, 2022
c111d03
Move `is_hex_prefix` to rz_num.
Rot127 May 4, 2022
60dd2aa
Rename `rz_asm_tokenize_asm_custom` -> `rz_asm_tokenize_asm_regex`
Rot127 May 4, 2022
53558c9
Add NULL checks
Rot127 May 4, 2022
534112a
Fix Hexagon coloring.
Rot127 May 4, 2022
cbed2c3
Due to runtime increase: Remove plugins copy of tokenized asm string.
Rot127 May 5, 2022
e8776c5
Align indentation
Rot127 May 5, 2022
5e4a809
Add RZ_NONNULL hint.
Rot127 Jun 30, 2022
095d7ae
Add doxygen.
Rot127 Jun 30, 2022
457dbc7
Put token patterns into pvector instead of list.
Rot127 Jun 30, 2022
82ca4c0
Fix "incompatible type for redefinition" during build.
Rot127 Jun 30, 2022
4579440
Add ||, == and <= as separators.
Rot127 Jun 30, 2022
26f96cb
Replace colorize_asm_string() with tokenized color method.
Rot127 Jun 30, 2022
002980b
Add helper method to colorize standard and tokenized asm strings.
Rot127 Jun 30, 2022
5d15ac2
Remove rz_print_colorize_opcode()
Rot127 Jun 30, 2022
d835f7f
Add seemingly senseless clang-formats
Rot127 Jul 1, 2022
60f61b9
Fix color in broken tests.
Rot127 Jul 1, 2022
7dace13
Add analysis op type to the parse parameters.
Rot127 Jul 2, 2022
1f67f1c
Replace code duplicates with rz_asm_colorize_asm_str.
Rot127 Jul 2, 2022
18b161b
Fix: Colored lea instructions recognition.
Rot127 Jul 2, 2022
c9f18dc
Recognize the special Hexagon register SP.
Rot127 Jul 2, 2022
c8a70e6
Parse arm mnemonics like `adc.w` not as number.
Rot127 Jul 2, 2022
1025dbf
Add ARM asm color test.
Rot127 Jul 2, 2022
51f5388
Add x86 color test & format
Rot127 Jul 2, 2022
266ce58
Add `::` as separator
Rot127 Jul 2, 2022
5d3af70
Add hexagon custom colorize unit tests.
Rot127 Jul 2, 2022
011b489
Use sizeof instead of hard coded buf size.
Rot127 Jul 2, 2022
e577e23
Parse prefixless numbers and update documentation about it.
Rot127 Jul 20, 2022
08fea67
Add TMS320 C5000 tests.
Rot127 Jul 20, 2022
2c07f48
Run clang-format
Rot127 Jul 20, 2022
4dde707
Fix unit test: `test_analysis_op`
Rot127 Jul 20, 2022
c13f1cc
Fix buffer overflow: use snprintf, increase buffers.
Rot127 Jul 20, 2022
1f1eba3
Fix color related tests.
Rot127 Jul 20, 2022
372fbad
Add `!` as hexagon operator
Rot127 Jul 20, 2022
a795e48
Also return buffered instructions at address 0x0.
Rot127 Jul 20, 2022
dc28bdb
Set incrementing PC for hexagon tests.
Rot127 Jul 21, 2022
8a23fe7
Add test for asm strings with UTF-8 chars.
Rot127 Jul 21, 2022
f440eb8
Fix test.
Rot127 Jul 21, 2022
9f9b71d
Fix Windows support:
Rot127 Jul 21, 2022
d64f53f
Fix color test on Windows.
Rot127 Jul 22, 2022
8fb437d
Reverse UTF-8 symbol removal for Windows.
Rot127 Jul 24, 2022
7ba531f
Check token coverage of asm string always, not just in debug level 2.
Rot127 Jul 26, 2022
1c1ec3c
Replace %d with PFMT macros.
Rot127 Jul 26, 2022
b1a7601
Replace NULL asserts with if branches. Add NULL checks.
Rot127 Jul 26, 2022
c3d8932
Move UNKNOWN token type to the top to make it the default in a calloc…
Rot127 Jul 26, 2022
0483178
Don't color asm string twice.
Rot127 Jul 27, 2022
5605e36
Add hints to deprecated note where to start implementing custom token…
Rot127 Jul 27, 2022
8cb54c1
Merge branch 'dev' into asm_print_token
Rot127 Jul 27, 2022
2 changes: 2 additions & 0 deletions librz/asm/arch/hexagon/hexagon.h
@@ -16,6 +16,7 @@
#include <rz_config.h>
#include <rz_list.h>
#include <rz_types.h>
#include <rz_util/rz_print.h>

#define HEX_MAX_OPERANDS 6
#define HEX_PARSE_BITS_MASK 0xc000
@@ -135,6 +136,7 @@ typedef struct {
RzList *const_ext_l; // Constant extender values.
RzAsm rz_asm; // Copy of RzAsm struct. Holds certain flags of interest for disassembly formatting.
RzConfig *cfg;
RzPVector /* RzAsmTokenPattern* */ *token_patterns; ///< PVector with token patterns. Priority ordered.
} HexState;

typedef enum {
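
Note on the "Priority ordered" comment for `token_patterns`: when two patterns can match the same text, the one stored earlier in the vector wins. The following is only a minimal sketch of that first-match-wins idea. It assumes plain predicate functions in place of the regex patterns this PR uses, and every name in it (`TokType`, `Pattern`, `classify`, the `is_*` helpers and their matching rules) is hypothetical and not part of Rizin.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token types. UNKNOWN is 0 so zeroed memory defaults to it. */
typedef enum {
	TOK_UNKNOWN = 0,
	TOK_REGISTER,
	TOK_NUMBER,
	TOK_MNEMONIC
} TokType;

typedef int (*MatchFn)(const char *word);

/* Hypothetical pattern: a token type plus a predicate standing in for a regex. */
typedef struct {
	TokType type;
	MatchFn matches;
} Pattern;

/* Assumed rule: 'r' or 'p' followed by one or more digits, e.g. "r12". */
static int is_register(const char *w) {
	if ((w[0] != 'r' && w[0] != 'p') || w[1] == '\0') {
		return 0;
	}
	for (size_t i = 1; w[i]; i++) {
		if (!isdigit((unsigned char)w[i])) {
			return 0;
		}
	}
	return 1;
}

/* Assumed rule: "0x" prefixed hex digits, or plain decimal digits. */
static int is_number(const char *w) {
	int hex = (w[0] == '0' && (w[1] == 'x' || w[1] == 'X'));
	const char *p = hex ? w + 2 : w;
	if (*p == '\0') {
		return 0;
	}
	for (; *p; p++) {
		if (hex ? !isxdigit((unsigned char)*p) : !isdigit((unsigned char)*p)) {
			return 0;
		}
	}
	return 1;
}

/* Assumed rule: a letter followed by letters, digits, '.' or '_'. */
static int is_mnemonic(const char *w) {
	if (!isalpha((unsigned char)w[0])) {
		return 0;
	}
	for (size_t i = 1; w[i]; i++) {
		if (!isalnum((unsigned char)w[i]) && w[i] != '.' && w[i] != '_') {
			return 0;
		}
	}
	return 1;
}

/* Priority order: "r12" satisfies both the register and the mnemonic rule,
 * so the register pattern has to come first. */
static const Pattern PATTERNS[] = {
	{ TOK_REGISTER, is_register },
	{ TOK_NUMBER, is_number },
	{ TOK_MNEMONIC, is_mnemonic },
};

/* Return the type of the first pattern that matches, or TOK_UNKNOWN. */
static TokType classify(const Pattern *patterns, size_t n, const char *word) {
	if (!patterns || !word) {
		return TOK_UNKNOWN;
	}
	for (size_t i = 0; i < n; i++) {
		if (patterns[i].matches(word)) {
			return patterns[i].type;
		}
	}
	return TOK_UNKNOWN;
}
```

With this ordering `classify(PATTERNS, 3, "r12")` yields `TOK_REGISTER` even though the mnemonic rule would also accept it. Moving the mnemonic entry to the front would turn every register into a mnemonic, which is why the order of the vector matters.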
8 changes: 8 additions & 0 deletions librz/asm/arch/hexagon/hexagon_arch.c
@@ -785,10 +785,14 @@ RZ_API void hexagon_reverse_opcode(const RzAsm *rz_asm, HexReversedOpcode *rz_re
memcpy(rz_reverse->asm_op, &(hi->asm_op), sizeof(RzAsmOp));
memcpy(rz_reverse->ana_op, &(hi->ana_op), sizeof(RzAnalysisOp));
rz_strbuf_set(&rz_reverse->asm_op->buf_asm, hi->mnem);
rz_reverse->asm_op->asm_toks = rz_asm_tokenize_asm_regex(&rz_reverse->asm_op->buf_asm, state->token_patterns);
rz_reverse->asm_op->asm_toks->op_type = hi->ana_op.type;
return;
case HEXAGON_DISAS:
memcpy(rz_reverse->asm_op, &(hi->asm_op), sizeof(RzAsmOp));
rz_strbuf_set(&rz_reverse->asm_op->buf_asm, hi->mnem);
rz_reverse->asm_op->asm_toks = rz_asm_tokenize_asm_regex(&rz_reverse->asm_op->buf_asm, state->token_patterns);
rz_reverse->asm_op->asm_toks->op_type = hi->ana_op.type;
return;
case HEXAGON_ANALYSIS:
memcpy(rz_reverse->ana_op, &(hi->ana_op), sizeof(RzAnalysisOp));
@@ -815,10 +819,14 @@ RZ_API void hexagon_reverse_opcode(const RzAsm *rz_asm, HexReversedOpcode *rz_re
memcpy(rz_reverse->asm_op, &hi->asm_op, sizeof(RzAsmOp));
memcpy(rz_reverse->ana_op, &hi->ana_op, sizeof(RzAnalysisOp));
rz_strbuf_set(&rz_reverse->asm_op->buf_asm, hi->mnem);
rz_reverse->asm_op->asm_toks = rz_asm_tokenize_asm_regex(&rz_reverse->asm_op->buf_asm, state->token_patterns);
rz_reverse->asm_op->asm_toks->op_type = hi->ana_op.type;
break;
case HEXAGON_DISAS:
memcpy(rz_reverse->asm_op, &hi->asm_op, sizeof(RzAsmOp));
rz_strbuf_set(&rz_reverse->asm_op->buf_asm, hi->mnem);
rz_reverse->asm_op->asm_toks = rz_asm_tokenize_asm_regex(&rz_reverse->asm_op->buf_asm, state->token_patterns);
rz_reverse->asm_op->asm_toks->op_type = hi->ana_op.type;
break;
case HEXAGON_ANALYSIS:
memcpy(rz_reverse->ana_op, &hi->ana_op, sizeof(RzAnalysisOp));
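
Note on the four added blocks in `hexagon_reverse_opcode`: each one tokenizes `buf_asm` and then writes `op_type` through the returned pointer without checking it. If the tokenizer can return NULL (an assumption here; its contract is not visible in this diff), that write would dereference NULL. The repeated pair of lines could also be folded into one helper. The following is a minimal sketch of such a helper using simplified stand-in types. `TokString`, `AsmOp`, `tokenize_stub` and `set_asm_tokens` are hypothetical names and not Rizin API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the token string type; only the field used here. */
typedef struct {
	int op_type; /* analysis op type, used later to pick a color */
} TokString;

/* Stand-in for RzAsmOp; only the fields used here. */
typedef struct {
	const char *buf_asm; /* disassembled text */
	TokString *asm_toks; /* tokenized form, NULL if tokenizing failed */
} AsmOp;

/* Stand-in for rz_asm_tokenize_asm_regex(). Assumed to return NULL on
 * bad input or allocation failure. The caller owns the result. */
static TokString *tokenize_stub(const char *asm_str, const void *patterns) {
	if (!asm_str || !patterns) {
		return NULL;
	}
	return calloc(1, sizeof(TokString));
}

/* Tokenize op->buf_asm and tag the result with the analysis op type.
 * Returns 1 on success and 0 if no token string could be built, in which
 * case op->asm_toks is left NULL and nothing is dereferenced. */
static int set_asm_tokens(AsmOp *op, const void *patterns, int op_type) {
	if (!op) {
		return 0;
	}
	op->asm_toks = tokenize_stub(op->buf_asm, patterns);
	if (!op->asm_toks) {
		return 0;
	}
	op->asm_toks->op_type = op_type;
	return 1;
}
```

Each `case` branch would then shrink to a single call, and callers that print the instruction can fall back to uncolored output when the helper returns 0.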
4 changes: 2 additions & 2 deletions librz/asm/arch/hexagon/hexagon_arch.h
@@ -42,12 +42,12 @@ typedef struct {

#define HEX_PKT_UNK "? "
#define HEX_PKT_SINGLE "[ "
#define HEX_PKT_SINGLE_UTF8 "[ "
#define HEX_PKT_SINGLE_UTF8 "[ "
#define HEX_PKT_FIRST_UTF8 "┌ "
#define HEX_PKT_MID_UTF8 "│ "
#define HEX_PKT_LAST_UTF8 "└ "
#define HEX_PKT_FIRST_SDK "{ "
#define HEX_PKT_SDK_PADDING " "
#define HEX_PKT_SDK_PADDING " "
#define HEX_PKT_LAST_SDK " }"
#define HEX_PKT_FIRST "/ "
#define HEX_PKT_MID "| "
2 changes: 2 additions & 0 deletions librz/asm/arch/hexagon/hexagon_disas.c
@@ -14,6 +14,7 @@
#include <rz_types.h>
#include <rz_util.h>
#include <rz_util/rz_hex.h>
#include <rz_util/rz_strbuf.h>
#include <rz_analysis.h>
#include "hexagon.h"
#include "hexagon_insn.h"
@@ -53661,5 +53662,6 @@ int hexagon_disasm_instruction(HexState *state, const ut32 hi_u32, RZ_INOUT HexI
sprintf(hi->mnem_infix, "invalid");
sprintf(hi->mnem, "%s%s%s", hi->pkt_info.mnem_prefix, hi->mnem_infix, hi->pkt_info.mnem_postfix);
}

return 4;
}