sqlite_parser/debug.py annotated source
Debugging Utilities
This module provides debugging tools for inspecting parser internals: token streams, AST trees, trace logs, and parser state. They are most useful when developing new parser features or diagnosing parse failures.
Debugging Capabilities
- Token Stream Visualization: Pretty-print tokens with optional highlighting
- AST Formatting: Hierarchical display of AST node trees
- Parser Tracing: Log parser method calls and decisions
- State Inspection: View parser state at any point
- Context Managers: Temporarily enable debugging
- High-Level Debug Parse: One-function debugging workflow
Usage Patterns
Quick Debugging
```python
from sqlite_parser.debug import debug_parse

# Parse with full tracing
statements = debug_parse("SELECT * FROM users", verbose=True)
```
Detailed Token Inspection
```python
from sqlite_parser import tokenize_sql
from sqlite_parser.debug import print_tokens

tokens = tokenize_sql("SELECT id FROM users")
print_tokens(tokens, highlight_pos=2)  # Highlight token at position 2
```
AST Inspection
```python
from sqlite_parser import parse_sql
from sqlite_parser.debug import print_ast

ast = parse_sql("SELECT * FROM users")
print_ast(ast)  # Pretty-print entire AST tree
```
Parser State Tracking
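```python
from sqlite_parser.debug import parser_debug_context, print_state
from sqlite_parser.lexer import Lexer
from sqlite_parser.parser import Parser

# Build a parser the same way debug_parse() does
tokens = Lexer("SELECT * FROM users").tokenize()
parser = Parser(tokens, debug=False)

with parser_debug_context(parser):
    result = parser.parse()
    state = parser.get_state()
    print_state(state)
```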
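The context manager restores the previous debug setting even when parsing fails. The sketch below illustrates that behavior without the real parser. `StubParser` is a hypothetical stand-in with only a `debug` flag and a `parse()` method that always raises; the context manager body follows the same save, set, restore shape as the one defined in this module.

```python
from contextlib import contextmanager


class StubParser:
    """Hypothetical stand-in for Parser: only the debug flag matters here."""

    def __init__(self):
        self.debug = False

    def parse(self):
        raise ValueError("simulated parse failure")


@contextmanager
def parser_debug_context(parser, enable=True):
    # Save the current flag, switch it, and restore it on exit.
    old_debug = parser.debug
    parser.debug = enable
    try:
        yield parser
    finally:
        parser.debug = old_debug


parser = StubParser()
seen_inside = None
try:
    with parser_debug_context(parser) as p:
        seen_inside = p.debug  # flag as seen while the block is active
        p.parse()              # raises inside the block
except ValueError:
    pass

print(seen_inside, parser.debug)
```

Because the restore happens in a `finally` clause, a failed parse does not leave the parser stuck in debug mode.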
```python
"""
Debug utilities for SQLite parser

Provides formatting and inspection tools for debugging the parser,
including token stream visualization, AST formatting, and parser tracing.
"""

from typing import List, Any
from contextlib import contextmanager
from .lexer import Token
from .ast_nodes import ASTNode
from .parser import Parser
```
Token Stream Formatter
format_token_stream() creates a tabular display of all tokens with their types
and values. This is crucial for understanding how the lexer tokenized the input.
Features
- Indexed: Each token shows its position in the stream
- Type Display: Token type name (SELECT, IDENTIFIER, NUMBER, etc.)
- Value Display: The actual text value
- Highlighting: Optional >>> marker for a specific token position
Output Example
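```
======================================================================
Token Stream
======================================================================
    [  0] SELECT          'SELECT'
    [  1] STAR            '*'
    [  2] FROM            'FROM'
    [  3] IDENTIFIER      'users'
======================================================================
```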
The highlighting helps visualize parser position during debugging.
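To see the highlight marker in isolation, here is a minimal sketch of the row formatting. `TokenType` and `Token` are hypothetical stand-ins for the lexer's own classes, `format_rows` is an illustrative helper rather than part of this module, and the three-space blank marker is an assumption chosen so highlighted and plain rows stay aligned.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class TokenType(Enum):
    # Hypothetical stand-in for the lexer's token types.
    SELECT = auto()
    STAR = auto()
    FROM = auto()
    IDENTIFIER = auto()


@dataclass
class Token:
    # Hypothetical stand-in for sqlite_parser.lexer.Token.
    type: TokenType
    value: str


def format_rows(tokens: List[Token], highlight_pos: Optional[int] = None) -> List[str]:
    """Build one row per token; the highlighted row starts with '>>>'."""
    rows = []
    for i, token in enumerate(tokens):
        marker = ">>>" if i == highlight_pos else "   "
        rows.append(f"{marker} [{i:3d}] {token.type.name:15} {token.value!r:20}")
    return rows


tokens = [
    Token(TokenType.SELECT, "SELECT"),
    Token(TokenType.STAR, "*"),
    Token(TokenType.FROM, "FROM"),
    Token(TokenType.IDENTIFIER, "users"),
]
rows = format_rows(tokens, highlight_pos=2)
print("\n".join(rows))
```

Passing the parser's current position as `highlight_pos` marks the token the parser is about to consume.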
```python
def format_token_stream(tokens: List[Token], highlight_pos: int = None) -> str:
    """
    Format token stream for pretty printing

    Args:
        tokens: List of tokens to format
        highlight_pos: Optional position to highlight

    Returns:
        Formatted string representation
    """
    lines = []
    lines.append("=" * 70)
    lines.append("Token Stream")
    lines.append("=" * 70)

    for i, token in enumerate(tokens):
        marker = ">>>" if i == highlight_pos else "   "
        type_str = f"{token.type.name:15}"
        value_str = f"{repr(token.value):20}"
        lines.append(f"{marker} [{i:3d}] {type_str} {value_str}")

    lines.append("=" * 70)
    return "\n".join(lines)
```
AST Tree Formatter
format_ast() recursively formats AST node trees with proper indentation to show
the hierarchical structure. This is essential for understanding what the parser built.
Formatting Rules
- Nodes: Show type name and attributes (excluding span for clarity)
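- Lists: Format as [...] with indented items
- Primitives: Show with repr() for clarity (strings show quotes)
- None: Displayed explicitly as a top-level value or list item; node attributes whose value is None are omitted
- Nesting: Attribute names indent 2 spaces under their node, and nested values indent 2 more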
Example Output
```
SelectStatement(
  select_core=
    SelectCore(
      columns=
        [
          ResultColumn(
            expression=
              Identifier(
                name='id'
              )
          )
        ]
      from_clause=
        FromClause(
          source=
            TableReference(
              name=
                QualifiedIdentifier(
                  parts=
                    [
                      'users'
                    ]
                )
            )
        )
    )
)
```
This visualization makes it easy to verify the parser built the correct structure.
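The formatting rules can be tried on a small tree without the real node classes. The sketch below is illustrative only: `ASTNode`, `Identifier`, and `ResultColumn` are hypothetical stand-ins with made-up fields, and `format_tree` is a compact re-statement of the rules above, not the module's own function.

```python
from dataclasses import dataclass
from typing import Any, Optional


class ASTNode:
    """Hypothetical stand-in for sqlite_parser.ast_nodes.ASTNode."""


@dataclass
class Identifier(ASTNode):
    name: str
    span: Optional[tuple] = None  # source location, hidden by the formatter


@dataclass
class ResultColumn(ASTNode):
    expression: ASTNode
    alias: Optional[str] = None   # None-valued attributes are skipped


def format_tree(node: Any, indent: int = 0) -> str:
    pad = " " * indent
    if node is None:
        return pad + "None"
    if isinstance(node, list):
        if not node:
            return pad + "[]"
        inner = [format_tree(item, indent + 2) for item in node]
        return "\n".join([pad + "[", *inner, pad + "]"])
    if not isinstance(node, ASTNode):
        return pad + repr(node)
    lines = [pad + type(node).__name__ + "("]
    for key, value in vars(node).items():
        if key == "span" or value is None:
            continue  # hide spans and unset attributes
        if isinstance(value, (list, ASTNode)):
            lines.append(" " * (indent + 2) + f"{key}=")
            lines.append(format_tree(value, indent + 4))
        else:
            lines.append(" " * (indent + 2) + f"{key}={value!r}")
    lines.append(pad + ")")
    return "\n".join(lines)


tree = [ResultColumn(expression=Identifier(name="id", span=(7, 9)))]
output = format_tree(tree)
print(output)
```

The `span` value and the unset `alias` do not appear in the output, which keeps the printed tree focused on structure.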
```python
def format_ast(node: Any, indent: int = 0) -> str:
    """
    Format AST node tree for pretty printing

    Args:
        node: AST node to format
        indent: Current indentation level

    Returns:
        Formatted string representation
    """
    if node is None:
        return " " * indent + "None"

    if isinstance(node, list):
        if not node:
            return " " * indent + "[]"

        lines = [" " * indent + "["]
        for item in node:
            lines.append(format_ast(item, indent + 2))
        lines.append(" " * indent + "]")
        return "\n".join(lines)

    if not isinstance(node, ASTNode):
        return " " * indent + repr(node)

    # Format AST node
    node_type = type(node).__name__
    lines = [" " * indent + f"{node_type}("]

    # Get node attributes (excluding span)
    attrs = {}
    for key, value in node.__dict__.items():
        if key != 'span' and value is not None:
            attrs[key] = value

    for key, value in attrs.items():
        if isinstance(value, (list, ASTNode)):
            lines.append(" " * (indent + 2) + f"{key}=")
            lines.append(format_ast(value, indent + 4))
        else:
            lines.append(" " * (indent + 2) + f"{key}={repr(value)}")

    lines.append(" " * indent + ")")
    return "\n".join(lines)


def format_parser_trace(trace_log: List[str]) -> str:
    """
    Format parser trace log for pretty printing

    Args:
        trace_log: List of trace messages

    Returns:
        Formatted string representation
    """
    lines = []
    lines.append("=" * 70)
    lines.append("Parser Trace")
    lines.append("=" * 70)
    lines.extend(trace_log)
    lines.append("=" * 70)
    return "\n".join(lines)


def format_parser_state(state: dict) -> str:
    """
    Format parser state dictionary for pretty printing

    Args:
        state: State dictionary from parser.get_state()

    Returns:
        Formatted string representation
    """
    lines = []
    lines.append("Parser State:")
    lines.append(f"  Position: {state['pos']}")
    lines.append(f"  Current Token: {state['token_type']}:{repr(state['token_value'])}")
    lines.append(f"  Depth: {state['depth']}")
    lines.append(f"  Active Method: {state['active_method']}")

    if state['stack']:
        lines.append(f"  Call Stack:")
        for i, method in enumerate(state['stack']):
            lines.append(f"    {i}. {method}")

    return "\n".join(lines)


@contextmanager
def parser_debug_context(parser: Parser, enable: bool = True):
    """
    Context manager for temporarily enabling parser debug mode

    Args:
        parser: Parser instance
        enable: Whether to enable debug mode

    Yields:
        Parser with debug enabled

    Example:
        with parser_debug_context(parser):
            result = parser.parse()
            parser.print_trace()
    """
    old_debug = parser.debug
    parser.debug = enable

    try:
        yield parser
    finally:
        parser.debug = old_debug


def print_tokens(tokens: List[Token], highlight_pos: int = None):
    """
    Print token stream to stdout

    Args:
        tokens: List of tokens
        highlight_pos: Optional position to highlight
    """
    print(format_token_stream(tokens, highlight_pos))


def print_ast(node: Any):
    """
    Print AST tree to stdout

    Args:
        node: AST node or list of nodes
    """
    print("=" * 70)
    print("AST")
    print("=" * 70)
    print(format_ast(node))
    print("=" * 70)


def print_state(state: dict):
    """
    Print parser state to stdout

    Args:
        state: State dictionary from parser.get_state()
    """
    print(format_parser_state(state))
```
One-Function Debugging
debug_parse() is the fastest way to debug parser issues. It performs the complete
parse workflow with optional verbose output showing every step.
What It Does
- Lexes the SQL into tokens
- Prints token stream (if verbose)
- Parses with debug tracing enabled
- Prints parser trace log (if verbose)
- Prints final AST (if verbose)
- Returns parsed statements
Usage
```python
# Quick parse with no output
statements = debug_parse("SELECT * FROM users")

# Full debugging output
statements = debug_parse("SELECT * FROM users", verbose=True)
```
Verbose Output Includes
- Token stream with indices and types
- Parser trace showing method calls and token consumption
- Final AST tree structure
This is perfect for troubleshooting: paste in problematic SQL, set verbose=True,
and see exactly what the parser is doing at each step.
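Since the verbose output is written with print(), it can be captured as a string, for example to attach to a bug report. The sketch below shows one way to do that with the standard library's `contextlib.redirect_stdout`. The `debug_parse` defined here is a hypothetical stand-in so the example runs on its own, and `capture_debug_output` is an illustrative helper that is not part of this module.

```python
import io
from contextlib import redirect_stdout


def debug_parse(sql, verbose=False):
    # Hypothetical stand-in: the real function lexes, parses, and prints
    # the token stream, trace, and AST when verbose is True.
    if verbose:
        print(f"(verbose output for {sql!r})")
    return []


def capture_debug_output(sql):
    """Run debug_parse verbosely and return (statements, printed text)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        statements = debug_parse(sql, verbose=True)
    return statements, buffer.getvalue()


statements, report = capture_debug_output("SELECT * FROM users")
print(report)
```

If the parse raises, the exception propagates out of this helper and the captured text is discarded, so a caller that needs the trace on failure would have to read the buffer in its own exception handler.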
```python
def debug_parse(sql: str, verbose: bool = False) -> List[ASTNode]:
    """
    Parse SQL with debug tracing enabled

    Args:
        sql: SQL string to parse
        verbose: If True, print trace after parsing

    Returns:
        List of parsed statements

    Example:
        statements = debug_parse("SELECT * FROM users", verbose=True)
    """
    from .lexer import Lexer

    lexer = Lexer(sql)
    tokens = lexer.tokenize()

    if verbose:
        print_tokens(tokens)
        print()

    parser = Parser(tokens, debug=True)

    try:
        result = parser.parse()

        if verbose:
            print(format_parser_trace(parser.get_trace_log()))
            print()
            print_ast(result)

        return result

    except Exception as e:
        if verbose:
            print(format_parser_trace(parser.get_trace_log()))
            print()
            print(f"Error: {e}")
        raise


# Export main functions
__all__ = [
    'format_token_stream',
    'format_ast',
    'format_parser_trace',
    'format_parser_state',
    'parser_debug_context',
    'print_tokens',
    'print_ast',
    'print_state',
    'debug_parse',
]
```