test_lexer.c

Unit tests for the 42sh lexer module.

Covers lexer_tokenize(), is_operator(), is_operator_start(), read_word(), read_operator(), token_new(), and lexer_free_tokens().

Functions

static t_token *nth_token(t_list *head, int n)

Return the nth token (0-based) from a token list.

Parameters:
  • head – Head of the list returned by lexer_tokenize().

  • n – Zero-based index of the desired node.

Returns:

Pointer to the t_token, or NULL if the list has n or fewer nodes.
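
The lookup described above can be sketched as follows. This is only an illustration: the t_list layout (a libft-style node with content and next) and the three t_token fields are assumptions, and check_nth_token is a hypothetical helper that is not part of the test file.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed libft-style list node; the real t_list may differ. */
typedef struct s_list
{
    void            *content;
    struct s_list   *next;
}   t_list;

/* Hypothetical token layout, reduced to what the example needs. */
typedef struct s_token
{
    int         type;
    const char  *value;
    int         io_number;
}   t_token;

/* Walk n links from head; return NULL if the list ends first. */
static t_token *nth_token(t_list *head, int n)
{
    if (n < 0)
        return (NULL);
    while (head != NULL && n > 0)
    {
        head = head->next;
        n--;
    }
    if (head == NULL)
        return (NULL);
    return ((t_token *)head->content);
}

/* Build a two-node list on the stack and probe both bounds. */
void check_nth_token(void)
{
    t_token a = {0, "echo", -1};
    t_token b = {0, "hello", -1};
    t_list  second = {&b, NULL};
    t_list  first = {&a, &second};

    assert(nth_token(&first, 0) == &a);
    assert(nth_token(&first, 1) == &b);
    assert(nth_token(&first, 2) == NULL);   /* index == length */
    assert(nth_token(NULL, 0) == NULL);     /* empty list */
}
```

Returning NULL rather than asserting inside the helper lets each test decide how to report a token stream that is too short.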

static void assert_token_type(t_list *tokens, int idx, t_token_type expected)

Assert that the token at index idx in tokens has type expected.

static void test_token_new_basic(void)

token_new allocates a node and sets all three fields correctly.

Verifies that the returned list node is non-NULL, that its content pointer is non-NULL, and that type, value, and io_number match the arguments.

static void test_token_new_io_number(void)

token_new stores a positive io_number without modification.
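
A minimal sketch of the properties these two tests check, assuming token_new takes a type, a value string, and an io_number and returns a list node wrapping a heap-allocated token. The real signature is not shown in this document, so sketch_token_new, sketch_token_free, check_token_new, and both struct layouts are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layouts; the real t_list and t_token may differ. */
typedef struct s_list
{
    void            *content;
    struct s_list   *next;
}   t_list;

typedef struct s_token
{
    int     type;
    char    *value;
    int     io_number;
}   t_token;

/* Hypothetical stand-in for token_new(): copies value, returns a node. */
t_list *sketch_token_new(int type, const char *value, int io_number)
{
    size_t  len = strlen(value) + 1;
    t_token *tok = malloc(sizeof(*tok));
    t_list  *node = malloc(sizeof(*node));
    char    *copy = malloc(len);

    if (tok == NULL || node == NULL || copy == NULL)
    {
        free(tok);
        free(node);
        free(copy);
        return (NULL);
    }
    memcpy(copy, value, len);
    tok->type = type;
    tok->value = copy;
    tok->io_number = io_number;
    node->content = tok;
    node->next = NULL;
    return (node);
}

/* Release one node created by sketch_token_new(). */
void sketch_token_free(t_list *node)
{
    t_token *tok;

    if (node == NULL)
        return ;
    tok = node->content;
    free(tok->value);
    free(tok);
    free(node);
}

/* Same checks as the two tests above: node, content, then the fields. */
void check_token_new(void)
{
    t_list  *node = sketch_token_new(1, "2>", 2);
    t_token *tok;

    assert(node != NULL);
    assert(node->content != NULL);
    tok = node->content;
    assert(tok->type == 1);
    assert(strcmp(tok->value, "2>") == 0);
    assert(tok->io_number == 2);
    sketch_token_free(node);
}
```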

static void test_is_operator_true(void)

is_operator returns non-zero for every recognised operator character.

static void test_is_operator_false(void)

is_operator returns zero for ordinary characters.

static void test_is_operator_start_simple(void)

is_operator_start detects a bare operator at the start of a string.

static void test_is_operator_start_io_number(void)

is_operator_start detects a digit-prefixed redirect (e.g. “2>”).

static void test_is_operator_start_digit_not_redir(void)

is_operator_start returns zero when digits are not followed by a redirect character.

static void test_is_operator_start_word(void)

is_operator_start returns zero for plain words.
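
The four is_operator_start cases above can be illustrated with a small sketch. It is not the lexer's actual code: the operator character set is an assumption, and the sketch_ and check_ names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed operator set; the real lexer may recognise more or fewer. */
int sketch_is_operator(char c)
{
    return (c != '\0' && strchr("|&;<>()\n", c) != NULL);
}

/* Non-zero if s begins with an operator, or with digits that are
   immediately followed by '<' or '>' (an io_number redirect). */
int sketch_is_operator_start(const char *s)
{
    size_t  i = 0;

    if (sketch_is_operator(s[0]))
        return (1);
    while (isdigit((unsigned char)s[i]))
        i++;
    return (i > 0 && (s[i] == '<' || s[i] == '>'));
}

/* One assertion per documented case. */
void check_operator_start(void)
{
    assert(sketch_is_operator_start("|"));      /* bare operator */
    assert(sketch_is_operator_start("2>"));     /* digit-prefixed redirect */
    assert(!sketch_is_operator_start("42x"));   /* digits, no redirect */
    assert(!sketch_is_operator_start("echo"));  /* plain word */
}
```

The cast to unsigned char before isdigit avoids undefined behaviour when char is signed and the input contains bytes above 127.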

static void test_read_word_simple(void)

read_word consumes a simple unquoted word and stops at whitespace.

static void test_read_word_single_quote(void)

read_word includes the entire single-quoted span as one token.

Spaces inside single quotes must not terminate the word.

static void test_read_word_double_quote(void)

read_word includes the entire double-quoted span as one token.

static void test_read_word_backslash_escape(void)

read_word respects a backslash-escaped space outside quotes.

“hello\ world” should be consumed as a single token.

static void test_read_word_stops_at_operator(void)

read_word stops at an operator character.

static void test_read_word_pipe_in_single_quote(void)

A pipe inside single quotes is NOT an operator boundary.
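
The read_word behaviours tested above (whitespace and operator boundaries, quoted spans, backslash escapes) can be sketched as a function that returns the length of the word at the start of a string. The delimiter set and the names ends_word, sketch_word_len, and check_word_len are assumptions; the real read_word signature is not given in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed set of characters that end an unquoted word. */
static int ends_word(char c)
{
    return (c == '\0' || strchr(" \t\n|&;<>()", c) != NULL);
}

/* Hypothetical stand-in for read_word(): length of the word at s. */
size_t sketch_word_len(const char *s)
{
    size_t  i = 0;
    char    quote = '\0';

    while (s[i] != '\0')
    {
        if (quote != '\0')
        {
            if (quote == '"' && s[i] == '\\' && s[i + 1] != '\0')
                i++;                /* escaped char inside "..." */
            else if (s[i] == quote)
                quote = '\0';       /* closing quote */
        }
        else if (s[i] == '\\' && s[i + 1] != '\0')
            i++;                    /* escaped char outside quotes */
        else if (s[i] == '\'' || s[i] == '"')
            quote = s[i];           /* opening quote */
        else if (ends_word(s[i]))
            break ;
        i++;
    }
    return (i);
}

/* One assertion per documented read_word case. */
void check_word_len(void)
{
    assert(sketch_word_len("echo hello") == 4);
    assert(sketch_word_len("'a b' c") == 5);
    assert(sketch_word_len("\"a b\" c") == 5);
    assert(sketch_word_len("hello\\ world") == 12);
    assert(sketch_word_len("ls|wc") == 2);
    assert(sketch_word_len("'a|b'") == 5);
}
```

An unterminated quote simply runs to the end of the string in this sketch; how the real lexer reports that case is not covered by the tests listed here.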

static t_token_type operator_type(const char *s)

Helper: tokenise a single operator string and return its type.

static int read_operator_crashes(const char *s)

Helper: check whether read_operator crashes on the input s.

static void test_read_operator_single_char(void)

read_operator correctly classifies every single-character operator.

static void test_read_operator_double_char(void)

read_operator correctly classifies every two-character operator.
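
One common way to classify single- and multi-character operators is a longest-match table, sketched below. Whether read_operator works this way is not stated here; the SK_ type names, g_ops table, and function names are hypothetical and cover only a subset of operators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type names standing in for the lexer's t_token_type. */
typedef enum e_sketch_type
{
    SK_NONE,
    SK_PIPE,
    SK_AND,
    SK_OR,
    SK_SEMI,
    SK_REDIR_IN,
    SK_REDIR_OUT,
    SK_REDIR_APPEND,
    SK_HEREDOC,
    SK_HEREDOC_STRIP
}   t_sketch_type;

typedef struct s_op_entry
{
    const char      *text;
    t_sketch_type   type;
}   t_op_entry;

/* Longer spellings come first so "<<-" wins over "<<" and "<". */
static const t_op_entry g_ops[] = {
    {"<<-", SK_HEREDOC_STRIP},
    {"<<", SK_HEREDOC},
    {">>", SK_REDIR_APPEND},
    {"&&", SK_AND},
    {"||", SK_OR},
    {"<", SK_REDIR_IN},
    {">", SK_REDIR_OUT},
    {"|", SK_PIPE},
    {";", SK_SEMI}
};

/* Return the type of the operator at s and store its length in *len. */
t_sketch_type sketch_operator_type(const char *s, size_t *len)
{
    size_t  i;
    size_t  n;

    for (i = 0; i < sizeof(g_ops) / sizeof(g_ops[0]); i++)
    {
        n = strlen(g_ops[i].text);
        if (strncmp(s, g_ops[i].text, n) == 0)
        {
            *len = n;
            return (g_ops[i].type);
        }
    }
    *len = 0;
    return (SK_NONE);
}

/* Single-character, two-character, and three-character cases. */
void check_operator_table(void)
{
    size_t  n;

    assert(sketch_operator_type("|", &n) == SK_PIPE && n == 1);
    assert(sketch_operator_type("||", &n) == SK_OR && n == 2);
    assert(sketch_operator_type(">>err", &n) == SK_REDIR_APPEND && n == 2);
    assert(sketch_operator_type("<<- EOF", &n) == SK_HEREDOC_STRIP && n == 3);
    assert(sketch_operator_type("echo", &n) == SK_NONE && n == 0);
}
```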

static void test_read_operator_io_number(void)

read_operator extracts an io_number from a digit-prefixed redirect.

“2>” must produce TOK_REDIR_OUT with io_number == 2.
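
A hedged sketch of io_number extraction: parse leading digits and accept them only if a redirect character follows. The function name, the -1 "no io_number" convention, and the op_offset out-parameter are assumptions for illustration, not the lexer's documented interface.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Return the io_number prefixing a redirect at s, or -1 if there is
   none. On success *op_offset is the index of the '<' or '>'. */
int sketch_io_number(const char *s, size_t *op_offset)
{
    size_t  i = 0;
    int     n = 0;

    *op_offset = 0;
    while (isdigit((unsigned char)s[i]))
    {
        if (n > (INT_MAX - 9) / 10)
            return (-1);            /* would overflow int */
        n = n * 10 + (s[i] - '0');
        i++;
    }
    if (i == 0 || (s[i] != '<' && s[i] != '>'))
        return (-1);                /* no digits, or not a redirect */
    *op_offset = i;
    return (n);
}

/* Covers the documented "2>" case and a malformed prefix. */
void check_io_number(void)
{
    size_t  off;

    assert(sketch_io_number("2>", &off) == 2 && off == 1);
    assert(sketch_io_number("2>>err", &off) == 2 && off == 1);
    assert(sketch_io_number("10<in", &off) == 10 && off == 2);
    assert(sketch_io_number("42x", &off) == -1 && off == 0);
    assert(sketch_io_number(">", &off) == -1 && off == 0);
}
```

Rejecting input such as "42x" up front, instead of assuming a redirect follows the digits, is one way to avoid reading an operator that is not there.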

static void test_read_operator_malformed_no_crash(void)

read_operator must not crash on malformed digit-prefixed non-operator input (e.g. “42x”).

static void test_tokenize_simple_command(void)

“echo hello” produces WORD WORD EOF (3 tokens).

static void test_tokenize_pipe(void)

“echo | cat” produces WORD PIPE WORD EOF (4 tokens).

static void test_tokenize_redirect_out(void)

“echo > file” produces WORD REDIR_OUT WORD EOF (4 tokens).

static void test_tokenize_heredoc(void)

“cat << EOF” produces WORD HEREDOC WORD EOF (4 tokens).

static void test_tokenize_heredoc_strip(void)

“cat <<- EOF” produces WORD HEREDOC_STRIP WORD EOF (4 tokens).

static void test_tokenize_empty(void)

An empty string produces only a single TOK_EOF token.

static void test_tokenize_whitespace_trimmed(void)

Leading and trailing spaces are ignored.

“ echo ” should yield the same token stream as “echo”.

static void test_tokenize_and_or(void)

“cmd && other || fallback” produces the correct AND/OR tokens.

static void test_tokenize_quoted_operators(void)

Quoted strings containing operators are treated as a single WORD.

“echo ‘|; <>’” must not produce any operator tokens.

static void test_tokenize_io_number(void)

A digit-prefixed redirect stores the correct io_number on the token.

“cmd 2>>err” must produce io_number == 2 on the TOK_REDIR_APPEND token.

static void test_tokenize_subshell(void)

Subshell syntax “(echo hi)” produces LPAREN WORD WORD RPAREN EOF.

static void test_tokenize_newline_separator(void)

Newline is a shell command separator and must be tokenized.

“echo\ncat” should produce WORD NEWLINE WORD EOF.

void test_lexer_suite(void)

Run all lexer unit tests.

Called by test_runner.c via MU_RUN(test_lexer_suite).