sqlite_parser/errors.py annotated source


Error Handling System

This module implements a hierarchical error system with position tracking for providing helpful, context-aware error messages to users.

Design Philosophy

Good error messages are critical for usability. This system provides:

  1. Position Tracking: Every error knows where it occurred (line and column)
  2. Context Display: Shows the problematic line with a caret (^) pointing to the error
  3. Error Hierarchy: Different error types for different phases (lexing, parsing, semantics)
  4. Inheritance: All errors inherit from ParseError for easy catching

Error Hierarchy

ParseError (base)
├── LexerError (tokenization errors)
│   └── InvalidTokenError
├── SyntaxError (parsing errors)
│   ├── UnexpectedTokenError
│   └── UnexpectedEOFError
└── SemanticError (context-sensitive errors)
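As a rough illustration of why the shared base class matters, the sketch below re-declares a few of the classes as empty stand-ins (they are not the module's real definitions) and uses a hypothetical `classify` helper to show that a handler for a parent class also catches its subclasses.

```python
# Empty stand-ins that mirror part of the hierarchy, so the sketch runs on its own.
class ParseError(Exception):
    pass


class LexerError(ParseError):
    pass


class InvalidTokenError(LexerError):
    pass


class SemanticError(ParseError):
    pass


def classify(exc: Exception) -> str:
    """Hypothetical helper: name the phase an error belongs to."""
    try:
        raise exc
    except LexerError:
        # Also catches InvalidTokenError, because it subclasses LexerError.
        return "lexing"
    except SemanticError:
        return "semantics"
    except ParseError:
        # Catch-all for any other error in the hierarchy.
        return "parsing (generic)"
    except Exception:
        return "unrelated"


print(classify(InvalidTokenError("bad token")))
print(classify(ValueError("not a parser error")))
```

Ordering the handlers from most specific to least specific is what makes the fallback to `ParseError` work; a caller that does not care about the phase can catch `ParseError` alone.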

"""
Error Classes for SQLite SQL Parser

Provides comprehensive error handling with position tracking and
helpful error messages.
"""

from typing import Optional
from .ast_nodes import Position, Span

Base Error Class

ParseError is the foundation of the error system. All parser errors inherit from it, providing consistent error formatting and position tracking across the entire parser.

Key Features

  • Stores position (line/column), span (start/end), and original sql text
  • The format_error() method creates user-friendly error messages with context:

Line 5, Column 10: Expected FROM, found FORM
SELECT * FORM users
         ^

Notice the typo "FORM" instead of "FROM" - the caret points exactly to the error location.

class ParseError(Exception):
    """Base class for all parsing errors"""

    def __init__(self, message: str, position: Optional[Position] = None,
                 span: Optional[Span] = None, sql: Optional[str] = None):
        self.message = message
        self.position = position
        self.span = span
        self.sql = sql
        super().__init__(self.format_error())

    def format_error(self) -> str:
        """Format error message with context"""
        result = self.message

        if self.position:
            result = f"Line {self.position.line}, Column {self.position.column}: {result}"

        if self.sql and self.position:
            # Show snippet of SQL around error
            lines = self.sql.split('\n')
            if 0 <= self.position.line - 1 < len(lines):
                error_line = lines[self.position.line - 1]
                result += f"\n{error_line}\n"
                result += " " * (self.position.column - 1) + "^"

        return result
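To see the formatting steps in isolation, here is a minimal sketch that repeats the logic of format_error() as a free function. The `Position` dataclass is a hypothetical stand-in for the one in ast_nodes (assumed to carry 1-based `line` and `column` fields), and `render_context` is a name invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Position:
    """Hypothetical stand-in for ast_nodes.Position (1-based fields assumed)."""
    line: int
    column: int


def render_context(message: str, position: Optional[Position],
                   sql: Optional[str]) -> str:
    """Repeat the format_error() steps outside the exception class."""
    result = message
    if position:
        result = f"Line {position.line}, Column {position.column}: {result}"
    if sql and position:
        lines = sql.split("\n")
        # Only show a snippet when the line number falls inside the input.
        if 0 <= position.line - 1 < len(lines):
            result += f"\n{lines[position.line - 1]}\n"
            # column - 1 spaces put the caret under the reported column.
            result += " " * (position.column - 1) + "^"
    return result


sql = "SELECT *\n  FORM users"
report = render_context("Expected FROM, found FORM", Position(line=2, column=3), sql)
print(report)
```

With this input the caret lands under the F of FORM. Without a position the message is returned unchanged, and a line number outside the input produces the header line with no snippet.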

Lexer Errors

LexerError represents problems during tokenization - the first phase of parsing. These occur when the lexer encounters invalid characters or malformed tokens.

Example Scenarios

  • Invalid escape sequences in strings
  • Unclosed string literals
  • Invalid hex digits in BLOB literals
  • Illegal characters not part of SQL syntax

class LexerError(ParseError):
    """Error during tokenization"""
    pass
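The unclosed-string scenario can be sketched as follows. This is not the project's lexer: `read_string_literal` is a hypothetical helper, the classes are empty stand-ins, and the only SQL rule assumed is that a doubled single quote inside a string literal stands for one quote character.

```python
from typing import Tuple


# Empty stand-ins so the sketch runs on its own.
class ParseError(Exception):
    pass


class LexerError(ParseError):
    pass


def read_string_literal(sql: str, start: int) -> Tuple[str, int]:
    """Scan a single-quoted literal at sql[start]; return (value, next index)."""
    if start >= len(sql) or sql[start] != "'":
        raise ValueError("start must point at an opening quote")
    chars = []
    i = start + 1
    while i < len(sql):
        if sql[i] == "'":
            if i + 1 < len(sql) and sql[i + 1] == "'":
                # A doubled quote is an escaped quote inside the literal.
                chars.append("'")
                i += 2
                continue
            return "".join(chars), i + 1
        chars.append(sql[i])
        i += 1
    # Input ended before the closing quote was found.
    raise LexerError(f"Unclosed string literal starting at offset {start}")


value, next_index = read_string_literal("'it''s' x", 0)
print(value, next_index)
```

A real lexer would pass a position and the SQL text to the error so that the caret display works; the sketch leaves them out to stay short.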

Syntax Errors

SyntaxError represents problems during parsing - when tokens are valid but arranged in an invalid grammar structure.

This is the most common error type, raised when the parser encounters unexpected tokens or missing required syntax elements.

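# Note: this name shadows Python's built-in SyntaxError within this module
# and in any module that imports it by name.
class SyntaxError(ParseError):
    """Syntax error during parsing"""
    pass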

Unexpected Token Error

UnexpectedTokenError is raised when the parser expects one token type but finds another. This is the workhorse of syntax error reporting.

Example

Expected TokenType.FROM, found TokenType.WHERE

The error stores the expected and found values as attributes, which supports detailed messages now and could support error recovery later.

class UnexpectedTokenError(SyntaxError):
    """Unexpected token encountered"""

    def __init__(self, expected: str, found: str,
                 position: Optional[Position] = None,
                 span: Optional[Span] = None, sql: Optional[str] = None):
        message = f"Expected {expected}, found {found}"
        super().__init__(message, position, span, sql)
        self.expected = expected
        self.found = found
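One plausible way a parser could raise this error is through an `expect` helper, sketched below. The helper, the plain-string token list, and the simplified class are all assumptions made for illustration; in particular, the stand-in derives directly from ParseError and drops the position, span and sql arguments.

```python
from typing import List


class ParseError(Exception):
    pass


class UnexpectedTokenError(ParseError):
    """Simplified stand-in: keeps only the expected/found attributes."""

    def __init__(self, expected: str, found: str):
        super().__init__(f"Expected {expected}, found {found}")
        self.expected = expected
        self.found = found


def expect(tokens: List[str], index: int, expected: str) -> int:
    """Hypothetical helper: check tokens[index] and return the next index."""
    found = tokens[index] if index < len(tokens) else "EOF"
    if found != expected:
        raise UnexpectedTokenError(expected, found)
    return index + 1


tokens = ["SELECT", "*", "WHERE"]
try:
    expect(tokens, 2, "FROM")
except UnexpectedTokenError as err:
    # The attributes stay available to the caller, not just the message text.
    summary = (err.expected, err.found, str(err))
    print(summary)
```

Because `expected` and `found` are kept as attributes, a caller can branch on them (for example to suggest a fix) without parsing the message string.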

Unexpected EOF Error

UnexpectedEOFError is raised when the parser reaches the end of input while still expecting more tokens.

Example Scenarios

  • SELECT * FROM (missing table name)
  • CREATE TABLE users ( (unclosed parenthesis)
  • BEGIN TRANSACTION (missing statement after transaction start)

This is a special case of unexpected token where the "found" token is always EOF.

class UnexpectedEOFError(SyntaxError):
    """Unexpected end of input"""

    def __init__(self, expected: str, position: Optional[Position] = None,
                 sql: Optional[str] = None):
        message = f"Unexpected end of input, expected {expected}"
        super().__init__(message, position, None, sql)
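The source does not say which position the parser reports at end of input. One reasonable convention, assumed here purely for illustration, is to point one column past the last character so the caret appears just after the final token. The `eof_position` helper below is hypothetical and returns a plain (line, column) tuple rather than a Position object.

```python
from typing import Tuple


def eof_position(sql: str) -> Tuple[int, int]:
    """Hypothetical helper: 1-based (line, column) just past the end of sql."""
    lines = sql.split("\n")
    # The last line's length plus one is the first column after the text.
    return len(lines), len(lines[-1]) + 1


line, column = eof_position("SELECT * FROM")
print(line, column)
```

Splitting on newlines matches how format_error() locates the offending line, so a position computed this way stays inside the range that the snippet display accepts.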

Invalid Token Error

InvalidTokenError is a specific LexerError for malformed tokens that can't be recovered from. This is rarer than syntax errors because the lexer is quite permissive (most invalid SQL will tokenize successfully, then fail during parsing).

class InvalidTokenError(LexerError):
    """Invalid token in input"""
    pass

Semantic Errors

SemanticError represents context-sensitive errors that aren't purely syntactic. These require understanding the meaning of the SQL, not just its structure.

Example Scenarios (not yet implemented)

  • Using a column name that doesn't exist
  • Type mismatches in expressions
  • Invalid function argument counts
  • Ambiguous column references

Currently this class exists as a placeholder for future semantic analysis features.

class SemanticError(ParseError):
    """Semantic error (context-sensitive)"""
    pass