AST (Abstract Syntax Tree)
Abstract Syntax Tree.
- Author
jguillem pulgamecanica
Enums
-
enum t_node_type
# Example $> (which ls) | echo "dude" && which top
Warning
Do not change the order of these enumerators.
Values:
-
enumerator NODE_COMMAND
-
enumerator NODE_PIPE
-
enumerator NODE_AND
-
enumerator NODE_OR
-
enumerator NODE_SEQUENCE
-
enumerator NODE_SUBSHELL
-
enumerator NODE_BLOCK
-
enumerator NODE_BACKGROUND
Functions
-
t_ast *ast_new_binary(t_node_type type, t_ast *left, t_ast *right)
-
t_ast *ast_new_group(t_node_type type, t_ast *child, t_list *redirs)
-
t_redir *redir_new(t_token_type type, int fd, char *target)
-
struct t_redir
- #include <ast.h>
Redirection data node.
Stored in t_cmd.redirs and t_group.redirs as t_list* (each node->content is a t_redir *).
Public Members
-
t_token_type type
Operator kind (TOK_REDIR_IN/OUT/APPEND/HEREDOC/…).
-
int fd
Source fd (-1 = default for the operator type).
-
char *target
Raw filename or fd-number string (unexpanded).
-
char *heredoc_delim
Delimiter word for << (raw, may be quoted).
-
int heredoc_fd
Read end of pipe holding collected heredoc body.
-
int heredoc_quoted
1 if delimiter was quoted (suppresses expansion).
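The fd convention above (-1 meaning "use the default for this operator") could be resolved at execution time roughly as follows. This is a minimal sketch, not the project's code: `redir_effective_fd` is a hypothetical helper, the token names are assumed spellings of the abbreviated TOK_REDIR_IN/OUT/APPEND/HEREDOC list, and the struct is trimmed to the fields needed here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed spellings of the token kinds abbreviated in the docs. */
typedef enum e_token_type
{
    TOK_REDIR_IN,
    TOK_REDIR_OUT,
    TOK_REDIR_APPEND,
    TOK_REDIR_HEREDOC
}   t_token_type;

/* Trimmed-down stand-in for t_redir (heredoc fields omitted). */
typedef struct s_redir
{
    t_token_type    type;
    int             fd;
    char            *target;
}   t_redir;

/* Hypothetical helper: an explicit fd wins; otherwise input-like
 * operators (< and <<) default to 0 and output-like ones (> and >>)
 * default to 1. */
int redir_effective_fd(const t_redir *redir)
{
    assert(redir != NULL);
    if (redir->fd != -1)
        return redir->fd;
    if (redir->type == TOK_REDIR_IN || redir->type == TOK_REDIR_HEREDOC)
        return 0;
    return 1;
}
```

With this helper, `2> log` (fd = 2) keeps its explicit descriptor, while a bare `> log` (fd = -1) falls back to standard output.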
-
struct t_cmd
- #include <ast.h>
Simple command data.
Stored in the t_ast.data union; access it through
t_ast->data.cmd
-
struct t_binary
- #include <ast.h>
Binary operation data (pipe, &&, ||, ;).
-
struct t_group
- #include <ast.h>
Group data (subshell, block, background).
-
struct t_ast
- #include <ast.h>
AST node - union-based for zero-overhead dispatch.
Public Members
-
t_node_type type
Discriminator selecting which arm of data is live.
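The discriminated-union layout can be pictured with the sketch below. It is an illustration under assumptions: only `type`, `data` and `data.cmd` are named in this reference, so the `binary` and `group` arm names, the `left`, `right` and `child` members, and the helper `ast_count_commands` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Enumerators in the documented order. */
typedef enum e_node_type
{
    NODE_COMMAND, NODE_PIPE, NODE_AND, NODE_OR, NODE_SEQUENCE,
    NODE_SUBSHELL, NODE_BLOCK, NODE_BACKGROUND
}   t_node_type;

typedef struct s_ast    t_ast;

/* Simplified payloads; member names are assumptions. */
typedef struct s_cmd    { char **argv; }                t_cmd;
typedef struct s_binary { t_ast *left; t_ast *right; }  t_binary;
typedef struct s_group  { t_ast *child; }               t_group;

struct s_ast
{
    t_node_type type;       /* says which arm of data is live */
    union
    {
        t_cmd       cmd;
        t_binary    binary;
        t_group     group;
    }   data;
};

/* Hypothetical helper: count simple commands by switching on the
 * discriminator and reading only the matching union arm. */
size_t  ast_count_commands(const t_ast *node)
{
    if (node == NULL)
        return 0;
    switch (node->type)
    {
    case NODE_COMMAND:
        return 1;
    case NODE_PIPE:
    case NODE_AND:
    case NODE_OR:
    case NODE_SEQUENCE:
        return ast_count_commands(node->data.binary.left)
            + ast_count_commands(node->data.binary.right);
    case NODE_SUBSHELL:
    case NODE_BLOCK:
    case NODE_BACKGROUND:
        return ast_count_commands(node->data.group.child);
    }
    return 0;
}
```

For the example line `(which ls) | echo "dude" && which top`, one plausible tree is AND(PIPE(SUBSHELL(which ls), echo), which top), which this helper would count as three commands.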
How is the t_ast generated?
The abstract syntax tree is generated from the tokenized line. Note: the shell is passed in only so that heredocs can be handled.
t_ast *parser_parse(t_list *tokens, t_shell *shell)
Ast to string
Stringify an AST subtree for human-readable job listings (t_jobs).
- Author
pulgamecanica
Parser
Recursive-descent parser: transforms the token stream into an AST.
- Author
jguillem pulgamecanica
Functions
-
t_ast *parser_parse(t_list *tokens, t_shell *shell)
Consume a t_list* of tokens and return the AST root.
Returns NULL on syntax error (error printed to stderr).
The token list is NOT freed here; the caller frees it with lexer_free_tokens().
This is the main function of the parser module. It calls parse_list, which wraps all the lower layers, and then checks for EOF:
parse_list (“;” and “&” separators) -> parse_and_or
parse_and_or (“&&” and “||” separators) -> parse_pipeline
parse_pipeline (“|” separator) -> parse_command
parse_command (reads a simple command) -> parse_subshell
parse_subshell recurses into parse_list
parse_heredoc walks the AST and collects heredocs
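The layering listed above can be sketched as a tiny recognizer. It is not the real parser: it works on an array of strings instead of a t_list of tokens, builds no AST, and only counts simple commands. Every `mini_*` name is hypothetical, and accepting a trailing separator is an assumption.

```c
#include <assert.h>
#include <string.h>

typedef struct s_mini_parser
{
    const char  **tok;      /* NULL-terminated token array */
    int         pos;        /* index of the current token */
    int         commands;   /* simple commands seen so far */
}   t_mini_parser;

static int  mini_list(t_mini_parser *p);

static int  mini_is_operator(const char *s)
{
    return !strcmp(s, ";") || !strcmp(s, "&") || !strcmp(s, "&&")
        || !strcmp(s, "||") || !strcmp(s, "|")
        || !strcmp(s, "(") || !strcmp(s, ")");
}

/* Consume the current token if it equals s. */
static int  mini_accept(t_mini_parser *p, const char *s)
{
    if (p->tok[p->pos] == NULL || strcmp(p->tok[p->pos], s) != 0)
        return 0;
    p->pos++;
    return 1;
}

/* Simple command (a run of words) or a subshell recursing on the list. */
static int  mini_command(t_mini_parser *p)
{
    if (mini_accept(p, "("))
        return mini_list(p) && mini_accept(p, ")");
    if (p->tok[p->pos] == NULL || mini_is_operator(p->tok[p->pos]))
        return 0;
    while (p->tok[p->pos] != NULL && !mini_is_operator(p->tok[p->pos]))
        p->pos++;
    p->commands++;
    return 1;
}

static int  mini_pipeline(t_mini_parser *p)
{
    if (!mini_command(p))
        return 0;
    while (mini_accept(p, "|"))
        if (!mini_command(p))
            return 0;
    return 1;
}

static int  mini_and_or(t_mini_parser *p)
{
    if (!mini_pipeline(p))
        return 0;
    while (mini_accept(p, "&&") || mini_accept(p, "||"))
        if (!mini_pipeline(p))
            return 0;
    return 1;
}

static int  mini_list(t_mini_parser *p)
{
    if (!mini_and_or(p))
        return 0;
    while (mini_accept(p, ";") || mini_accept(p, "&"))
    {
        /* Assumption: a trailing separator may end the list. */
        if (p->tok[p->pos] == NULL || !strcmp(p->tok[p->pos], ")"))
            break;
        if (!mini_and_or(p))
            return 0;
    }
    return 1;
}

/* Entry point: parse the whole list, then require end of input.
 * Returns the number of simple commands, or -1 on syntax error. */
int mini_parse(const char **tokens)
{
    t_mini_parser   p = { tokens, 0, 0 };

    if (!mini_list(&p) || p.tok[p.pos] != NULL)
        return -1;
    return p.commands;
}
```

Each function handles the operators of one precedence level and delegates everything tighter-binding to the next function down, which is why `|` groups more tightly than `&&`, and `&&` more tightly than `;`.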
-
int parser_collect_heredocs(t_ast *ast, t_shell *shell)
Walk the AST after parsing and read the heredoc content from stdin for each << redirection.
shell is needed to read from the correct fd and to check SIGINT.
Returns 0 on success, -1 if SIGINT aborted heredoc input.
-
int parser_accept(t_parser *p, t_token_type type)
-
int is_redir(t_token_type type)
-
void cmd_free(t_cmd *cmd)
Free a partially-built t_cmd on a parse error.
Releases argv (and every strdup’d entry), assignments list, redirs list (via
redir_free), and the struct itself.
-
void redir_free(t_redir *redir)
-
struct t_parser
- #include <parser.h>
Parser internal state.
Parser Entry Point
main file of parser module
file for parser utils functions
- Author
jguillem, pulgamecanica
- Author
jguillem
Functions
-
t_ast *parser_parse(t_list *tokens, t_shell *shell)
Consume a t_list* of tokens and return the AST root.
This is the main function of the parser module. It calls parse_list, which wraps all the lower layers, and then checks for EOF:
parse_list (“;” and “&” separators) -> parse_and_or
parse_and_or (“&&” and “||” separators) -> parse_pipeline
parse_pipeline (“|” separator) -> parse_command
parse_command (reads a simple command) -> parse_subshell
parse_subshell recurses into parse_list
parse_heredoc walks the AST and collects heredocs
Simple Commands
file for parse command
- Author
jguillem
Functions
-
static char *collect_arith_token(t_parser *p)
Helper that reconstructs an arithmetic expansion such as “$((a + b))” from its tokens into a single string.
- Parameters:
p – t_parser struct
- Returns:
the reconstructed string (char *)
-
static int is_assignment(char *str)
Check whether str is a POSIX assignment word.
The portion BEFORE the first ‘=’ must be a valid identifier (alpha or ‘_’ first, then alnum or ‘_’). The portion AFTER the ‘=’ is the value and may contain any characters, including quotes - quote handling is the expander’s job.
- Parameters:
str – char *
- Returns:
1 if str is a NAME=VALUE assignment, 0 otherwise.
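The rule described for is_assignment can be written down as the sketch below. It follows the documented behaviour (identifier before the first ‘=’, anything after it), but the function name `is_assignment_sketch` is hypothetical and the real static function may differ in detail, for example in how it treats NULL.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if str looks like NAME=VALUE, 0 otherwise.
 * NAME: first char alpha or '_', then alnum or '_'.
 * VALUE: anything, including quotes or nothing at all. */
int is_assignment_sketch(const char *str)
{
    size_t  i;

    if (str == NULL)
        return 0;
    if (!isalpha((unsigned char)str[0]) && str[0] != '_')
        return 0;
    i = 1;
    while (isalnum((unsigned char)str[i]) || str[i] == '_')
        i++;
    /* The identifier must be followed directly by '='. */
    return str[i] == '=';
}
```

The casts to `unsigned char` avoid undefined behaviour when `isalpha` or `isalnum` receives a negative `char` value.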
Pipes
file for pipeline parser
- Author
jguillem
Logical Operators
file for && and || parser
- Author
jguillem
Functions
-
static int detect_and_or(t_parser *p, t_node_type *type)
helper function to detect TOK_AND and TOK_OR
- Parameters:
p – struct s_parser
type – enum e_node_type
- Returns:
0 (neither), 1 (TOK_AND), 2 (TOK_OR)
Sequences and Background
file for parse list
- Author
jguillem
Functions
-
static int detect_separator(t_parser *p, t_node_type *operator)
Helper function for parse_list. Newlines act as command terminators (POSIX 2.10), so ; and a literal \n between commands are equivalent. This matters for multi-line () / {} groups and for any input that came through shell_read_logical_line, which joins physical lines with \n to preserve quote interiors and paren grouping.
- Parameters:
p – struct s_parser pointer
operator – enum e_node_type pointer
- Returns:
0 (no separator), 1 (; or \n separator), 2 (& separator)
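The return convention can be illustrated on a single token. This is a sketch under assumptions: the real function inspects the parser state rather than one token, the token names TOK_SEMICOLON, TOK_NEWLINE and TOK_AMP are hypothetical, and mapping the separators to NODE_SEQUENCE and NODE_BACKGROUND is a guess based on the enum names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds, enough for this illustration. */
typedef enum e_token_type
{
    TOK_WORD, TOK_SEMICOLON, TOK_NEWLINE, TOK_AMP, TOK_EOF
}   t_token_type;

/* Enumerators in the documented order. */
typedef enum e_node_type
{
    NODE_COMMAND, NODE_PIPE, NODE_AND, NODE_OR, NODE_SEQUENCE,
    NODE_SUBSHELL, NODE_BLOCK, NODE_BACKGROUND
}   t_node_type;

/* Returns 0 (no separator), 1 (; or newline), 2 (&).
 * *op is written only when a separator is recognised. */
int classify_separator(t_token_type tok, t_node_type *op)
{
    assert(op != NULL);
    if (tok == TOK_SEMICOLON || tok == TOK_NEWLINE)
    {
        *op = NODE_SEQUENCE;    /* newline behaves exactly like ';' */
        return 1;
    }
    if (tok == TOK_AMP)
    {
        *op = NODE_BACKGROUND;
        return 2;
    }
    return 0;
}
```

Because the newline case shares a branch with `;`, a command list split across lines yields the same node type as the one-line form with semicolons.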
Subshells and Blocks
file for parse group
- Author
jguillem
Functions
-
t_ast *ast_new_group(t_node_type type, t_ast *child, t_list *redirs)
-
static void parse_group_cleanup(t_ast *child, t_list *redirs)
Free the child AST and any partially-built redirs list.
Called from every error path in parse_group, parse_subshell and parse_block so a half-built subshell/block cannot leak its inner AST plus accumulated redirections on a syntax error (e.g. random-binary input that parses partway then dies).
-
static t_ast *parse_group(t_parser *p, t_ast *child, t_node_type node)
Shared tail of parse_subshell and parse_block, factored into one function.
- Parameters:
p – struct s_parser pointer
child – struct s_ast pointer
node – enum e_node_type
- Returns:
pointer to struct s_ast
Here-documents
file to handle heredoc
- Author
jguillem
Functions
-
static char *strip_tab(char *origin)
Strip the leading tabs from origin.
- Parameters:
origin – : raw string
- Returns:
an allocated cleaned string
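A minimal sketch of this behaviour follows, assuming that only tab characters are removed (as for the `<<-` operator) and that the caller owns the returned copy. The name `strip_leading_tabs` is hypothetical; the real function is static and its handling of NULL or allocation failure is not documented here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of origin without its leading
 * tabs. Spaces and interior tabs are kept. Returns NULL if origin
 * is NULL or if allocation fails. The caller frees the result. */
char    *strip_leading_tabs(const char *origin)
{
    size_t  len;
    char    *copy;

    if (origin == NULL)
        return NULL;
    while (*origin == '\t')
        origin++;
    len = strlen(origin);
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, origin, len + 1);  /* copies the terminating NUL too */
    return copy;
}
```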
-
static int write_line(int fd, char *line, size_t len)
-
static int pop_queued_heredoc(t_shell *shell, int fd)
Pop the next pre-collected heredoc body from shell->heredoc_body_queue and write it to fd.
The REPL’s shell_read_logical_line scans command lines for <<DELIM operators and pushes one body per heredoc onto the queue, in declaration order. The parser’s AST walk visits heredocs in the same order, so a plain FIFO pop is correct. Bodies are already tab-stripped per <<- rules and contain newline-terminated lines (or “” for empty bodies).
- Returns:
0 on success, -1 on write failure.
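The FIFO behaviour can be sketched with a plain singly linked queue. This is an illustration only: the real queue lives in shell->heredoc_body_queue and its element type is not documented here, so `t_body_node`, `body_queue_push` and `body_queue_pop` are hypothetical names, and returning -1 on an empty queue is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

typedef struct s_body_node
{
    char                *body;
    struct s_body_node  *next;
}   t_body_node;

/* Append a copy of body at the tail. Returns 0, or -1 on allocation failure. */
int body_queue_push(t_body_node **queue, const char *body)
{
    t_body_node *node;
    size_t      len;

    assert(queue != NULL && body != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    len = strlen(body);
    node->body = malloc(len + 1);
    if (node->body == NULL)
    {
        free(node);
        return -1;
    }
    memcpy(node->body, body, len + 1);
    node->next = NULL;
    while (*queue != NULL)
        queue = &(*queue)->next;
    *queue = node;
    return 0;
}

/* Pop the head and write its body to fd, looping over short writes.
 * Returns 0 on success, -1 on an empty queue or a write failure. */
int body_queue_pop(t_body_node **queue, int fd)
{
    t_body_node *head;
    const char  *cur;
    size_t      left;
    ssize_t     n;
    int         status;

    assert(queue != NULL);
    head = *queue;
    if (head == NULL)
        return -1;
    *queue = head->next;
    status = 0;
    cur = head->body;
    left = strlen(cur);
    while (left > 0 && status == 0)
    {
        n = write(fd, cur, left);
        if (n <= 0)
            status = -1;
        else
        {
            cur += n;
            left -= (size_t)n;
        }
    }
    free(head->body);
    free(head);
    return status;
}
```

Pushing at the tail and popping at the head is what keeps the bodies in declaration order, so the first `<<` seen by the walk receives the first body that was collected.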
-
static int read_heredoc(t_redir *redir, t_shell *shell, int fd)
Read or recover the heredoc body. If the REPL already pre-collected the body into shell->heredoc_body_queue (the case for any input that came through shell_read_logical_line, i.e. both interactive and piped stdin), we pop it. Otherwise we fall back to reading line-by-line via shell_read_line (used when parser_parse is called outside the REPL, e.g. unit tests or -c mode, although -c mode doesn’t yet handle heredocs since the lexer doesn’t recognise body lines inline).
- Parameters:
redir – : a pointer to struct s_redir
shell – : shell state, used to reach the shared input reader
fd – : the backing-store fd the heredoc body is written to
- Returns:
0 (success) | -1 (failure or SIGINT)
-
static int collect_heredoc(t_redir *redir, t_shell *shell)
collect one heredoc body into an unlinked temp file and store the readable fd (rewound to offset 0) in redir->heredoc_fd. There is no fork: the collector runs in the shell process and therefore shares the one stdin cursor used to read command lines.
- Parameters:
redir – : pointer on a struct s_redir
shell – : shell state
- Returns:
0 | -1
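The "unlinked temp file, rewound to offset 0" technique can be sketched as follows. This is a hedged illustration: `heredoc_store` is a hypothetical name, the body is passed in as one string rather than read line by line, and the `/tmp` template path is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write body into an anonymous temp file and return an fd positioned
 * at offset 0, ready to be used as a command's stdin.
 * Returns -1 on failure. The caller closes the fd. */
int heredoc_store(const char *body)
{
    char    path[] = "/tmp/heredoc_sketch_XXXXXX";
    size_t  left;
    ssize_t n;
    int     fd;

    assert(body != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    /* Drop the name right away: the data stays reachable through fd
     * and the file is reclaimed once the last descriptor is closed. */
    unlink(path);
    left = strlen(body);
    while (left > 0)
    {
        n = write(fd, body, left);
        if (n <= 0)
        {
            close(fd);
            return -1;
        }
        body += n;
        left -= (size_t)n;
    }
    /* Rewind so the reader starts at the first byte of the body. */
    if (lseek(fd, 0, SEEK_SET) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```

Unlike a pipe, a regular file has no capacity limit that could block the writer, so the whole body can be written before any reader exists.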
-
static int collect_heredocs_from_list(t_list *lst, t_shell *shell)
Helper for collecting heredocs from a command or subshell: walks the stored redirections list.
- Parameters:
lst – : a struct s_list pointer
shell – : shell state
-
static int collect_heredocs_from_command(t_cmd *cmd, t_shell *shell)
collect redirections of commands
- Parameters:
cmd – : struct s_cmd pointer
shell – : shell state
-
static int collect_heredocs_from_group(t_group *group, t_shell *shell)
collect redirections of group
- Parameters:
group – : struct s_group pointer
shell – : shell state
-
void heredoc_expand_config(t_redir *redir)
-
static int ast_walk(t_ast *ast, t_shell *shell)
Traverse the AST and collect heredoc content, stopping on SIGINT.
- Parameters:
ast – : a pointer on a struct s_ast
shell – : the shell struct
- Returns:
0 on success, -1 on SIGINT
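The traversal with early stop can be sketched on a simplified tree. Everything here is hypothetical: `t_walk_node` stands in for t_ast, a callback stands in for the heredoc collector, and visiting children before a node's own redirection is an assumption chosen so that `( cat <<A ) <<B` yields A before B.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in node: heredoc_id 0 means "no heredoc on this node". */
typedef struct s_walk_node
{
    int                 heredoc_id;
    struct s_walk_node  *left;
    struct s_walk_node  *right;
}   t_walk_node;

typedef int (*t_walk_visit)(int heredoc_id, void *ctx);

/* Left subtree, right subtree, then the node's own heredoc.
 * The first visitor failure (e.g. SIGINT) aborts the walk with -1. */
int walk_heredocs(const t_walk_node *node, t_walk_visit visit, void *ctx)
{
    if (node == NULL)
        return 0;
    if (walk_heredocs(node->left, visit, ctx) < 0)
        return -1;
    if (walk_heredocs(node->right, visit, ctx) < 0)
        return -1;
    if (node->heredoc_id != 0 && visit(node->heredoc_id, ctx) < 0)
        return -1;
    return 0;
}

/* Example visitor: records ids in visit order and fails on one
 * chosen id to simulate an interrupted collection. */
typedef struct s_walk_trace
{
    int ids[8];
    int count;
    int fail_on;
}   t_walk_trace;

int walk_trace_visit(int heredoc_id, void *ctx)
{
    t_walk_trace    *trace = ctx;

    assert(trace != NULL);
    if (heredoc_id == trace->fail_on)
        return -1;
    if (trace->count < 8)
        trace->ids[trace->count++] = heredoc_id;
    return 0;
}
```

Propagating -1 immediately means heredocs that follow the interrupted one are never prompted for.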
-
static void heredoc_sigint_handler(int signal)
Variables
-
volatile sig_atomic_t g_sigint_heredoc = 0
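A flag of type volatile sig_atomic_t is the usual way for a signal handler to tell the main flow that SIGINT arrived. The sketch below shows one way such a flag could be wired up. It is an assumption about usage, not the module's actual code: the handler body and the two helpers `heredoc_sigint_install` and `heredoc_was_interrupted` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t    g_sigint_heredoc = 0;

/* Only sets the flag: that is async-signal-safe. */
static void heredoc_sigint_handler(int signal)
{
    (void)signal;
    g_sigint_heredoc = 1;
}

/* Install the handler and save the previous disposition in *saved
 * so the caller can restore it once heredoc input is finished. */
int heredoc_sigint_install(struct sigaction *saved)
{
    struct sigaction    sa;

    assert(saved != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = heredoc_sigint_handler;
    sigemptyset(&sa.sa_mask);
    /* No SA_RESTART: a blocking read() fails with EINTR, giving the
     * reader loop a chance to check the flag. */
    sa.sa_flags = 0;
    g_sigint_heredoc = 0;
    return sigaction(SIGINT, &sa, saved);
}

int heredoc_was_interrupted(void)
{
    return g_sigint_heredoc != 0;
}
```

A reader loop would call `heredoc_was_interrupted()` after each line (or after a failed read) and return -1 when it reports true, then restore the saved disposition with `sigaction(SIGINT, &saved, NULL)`.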
Utilities
Functions
-
t_ast *ast_new_binary(t_node_type type, t_ast *left, t_ast *right)
-
int is_redir(t_token_type type)
-
int parser_accept(t_parser *p, t_token_type type)
AST Memory Management
file for free memory functions
- Author
jguillem