Lexical Tokenizer

Defines

#define U_TOKEN_SZ   128
 Maximum size of input tokens (can be changed at compile time via the -DU_TOKEN_SZ=nnn compiler flag).
#define U_LEXER_ERR(l,...)
 u_lexer_seterr wrapper.
#define U_LEXER_SKIP(l, pc)
 u_lexer_skip wrapper.
#define U_LEXER_NEXT(l, pc)
 u_lexer_next wrapper.
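
A compile-time override such as -DU_TOKEN_SZ=nnn normally works through a guarded default. The following is a minimal sketch of that idiom, not a copy of the library header: the #ifndef guard and the token_buf_size() helper are assumptions made for illustration.

```c
#include <stddef.h>

/* Guarded default: a -DU_TOKEN_SZ=nnn flag on the compiler command line
 * takes precedence, because the macro is then already defined here. */
#ifndef U_TOKEN_SZ
#define U_TOKEN_SZ 128
#endif

/* Hypothetical helper: reports the size of a token buffer declared with
 * the macro, so the effective value can be checked at run time. */
size_t token_buf_size(void)
{
    char match[U_TOKEN_SZ];

    return sizeof match;
}
```

Every buffer handed to the lexer as a match destination should be sized with the macro rather than with a literal, so that it follows the override.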

Typedefs

typedef struct u_lexer_s u_lexer_t
 Lexer base type.

Functions

int u_lexer_new (const char *s, u_lexer_t **pl)
 Create a new lexer context associated with the NUL-terminated string s.
void u_lexer_free (u_lexer_t *l)
 Destroy a previously allocated lexer context.
const char * u_lexer_geterr (u_lexer_t *l)
 Accessor method for lexer error string.
int u_lexer_next (u_lexer_t *l, char *pb)
 Get the next char (whitespace included) and advance the lexer position by one.
int u_lexer_skip (u_lexer_t *l, char *pb)
 Get next non-whitespace char and advance lexer position accordingly.
char u_lexer_peek (u_lexer_t *l)
 Get the character currently under the lexer cursor.
int u_lexer_seterr (u_lexer_t *l, const char *fmt,...)
 Setter method for lexer error string.
void u_lexer_record_lmatch (u_lexer_t *l)
 Register left side (i.e. beginning) of a match.
void u_lexer_record_rmatch (u_lexer_t *l)
 Register right side (i.e. closing) of the current match.
char * u_lexer_get_match (u_lexer_t *l, char match[U_TOKEN_SZ])
 Extract the matched sub-string.
int u_lexer_eot (u_lexer_t *l)
 Tell if we've reached the end of the string.
int u_lexer_eat_ws (u_lexer_t *l)
 Advance lexer position until a non-whitespace char is found.
int u_lexer_expect_char (u_lexer_t *l, char expected)
 Expect to find the supplied character under lexer cursor.
size_t u_lexer_pos (u_lexer_t *l)
 Return the current position of the lexer cursor.
const char * u_lexer_lookahead (u_lexer_t *l)
 Return a pointer to the substring that has not yet been parsed.

Function Documentation

int u_lexer_eat_ws ( u_lexer_t * l )
Parameters:
l An active lexer context.
Return values:
0 on success
-1 on end of text

Definition at line 191 of file srcs/toolbox/lexer.c.

References u_lexer_eot().
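
The return values documented above can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the library's code: mini_lexer_t, mini_eot() and mini_eat_ws() are hypothetical names, and the context is reduced to a string plus a cursor offset.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for a lexer context: the text and a cursor offset. */
typedef struct {
    const char *s;
    size_t pos;
} mini_lexer_t;

/* 1 if the cursor sits on the terminating NUL, 0 otherwise. */
int mini_eot(const mini_lexer_t *l)
{
    return l->s[l->pos] == '\0';
}

/* Move the cursor to the next non-whitespace char.
 * Returns 0 if one was found, -1 if the end of text was hit first. */
int mini_eat_ws(mini_lexer_t *l)
{
    while (!mini_eot(l) && isspace((unsigned char) l->s[l->pos]))
        l->pos++;

    return mini_eot(l) ? -1 : 0;
}
```

In this sketch a cursor that already rests on a non-whitespace char is left where it is and the call reports success.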

int u_lexer_eot ( u_lexer_t * l )
Parameters:
l Handler of an active lexer context.
Return values:
0 if the current position is not at the end of text
1 if the current position is at the end of text

Definition at line 178 of file srcs/toolbox/lexer.c.

Referenced by u_lexer_eat_ws().

int u_lexer_expect_char ( u_lexer_t * l,
char  expected 
)
Parameters:
l An active lexer context.
expected The character that we expect to be found under the lexer cursor.
Return values:
0 on success
~0 on failure

Definition at line 276 of file srcs/toolbox/lexer.c.

References U_LEXER_ERR, U_LEXER_NEXT, and u_lexer_peek().
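
Since failure is reported as ~0 rather than as a fixed literal, callers are safest testing the result against 0. A minimal sketch of that calling pattern follows. The names mini_lexer_t, mini_expect_char() and mini_expect_seq() are hypothetical, and the choice to advance the cursor only on a successful match is an assumption, not documented behavior.

```c
#include <stddef.h>

/* Hypothetical stand-in for a lexer context: the text and a cursor offset. */
typedef struct {
    const char *s;
    size_t pos;
} mini_lexer_t;

/* Returns 0 and advances by one when the char under the cursor equals
 * `expected`; returns ~0 and leaves the cursor untouched otherwise
 * (end of text included). */
int mini_expect_char(mini_lexer_t *l, char expected)
{
    if (l->s[l->pos] == '\0' || l->s[l->pos] != expected)
        return ~0;

    l->pos++;

    return 0;
}

/* Expect every char of `seq` in order, stopping at the first mismatch.
 * Failure is tested with "!= 0", which does not depend on the exact
 * value of ~0. */
int mini_expect_seq(mini_lexer_t *l, const char *seq)
{
    for (; *seq != '\0'; seq++)
        if (mini_expect_char(l, *seq) != 0)
            return ~0;

    return 0;
}
```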

void u_lexer_free ( u_lexer_t * l )
Parameters:
l Handler to a previously created u_lexer_t object.
Returns:
nothing

Definition at line 90 of file srcs/toolbox/lexer.c.

References u_free().

Referenced by u_lexer_new(), and u_uri_crumble().

char * u_lexer_get_match ( u_lexer_t * l,
char  match[U_TOKEN_SZ] 
)
Parameters:
l An active lexer context.
match A buffer of at least U_TOKEN_SZ bytes that will hold the matched NUL-terminated substring.
Returns:
the matched substring

Definition at line 252 of file srcs/toolbox/lexer.c.
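
The buffer contract (at least U_TOKEN_SZ bytes, NUL-terminated result) can be sketched as a bounded copy. This is a stand-in under stated assumptions, not the library's implementation: mini_get_match() is a hypothetical name, the left and right offsets are passed explicitly instead of being read from a context, and treating the right offset as inclusive is an assumption.

```c
#include <stddef.h>
#include <string.h>

#ifndef U_TOKEN_SZ
#define U_TOKEN_SZ 128
#endif

/* Hypothetical: copy s[lmatch..rmatch] (both ends inclusive, by assumption)
 * into `match`, truncating to U_TOKEN_SZ - 1 chars and always
 * NUL-terminating.  Both offsets must lie inside `s`.
 * Returns `match`, so the call can be used directly as a string. */
char *mini_get_match(const char *s, size_t lmatch, size_t rmatch,
        char match[U_TOKEN_SZ])
{
    size_t len = (rmatch >= lmatch) ? rmatch - lmatch + 1 : 0;

    if (len > U_TOKEN_SZ - 1)
        len = U_TOKEN_SZ - 1;

    memcpy(match, s + lmatch, len);
    match[len] = '\0';

    return match;
}
```

Returning the caller's buffer mirrors the documented return value ("the matched substring") and keeps ownership of the storage with the caller.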

const char * u_lexer_geterr ( u_lexer_t * l )
Parameters:
l Handler of a lexer context that may have generated an error.
Returns:
the error string (if any)

Definition at line 110 of file srcs/toolbox/lexer.c.

const char * u_lexer_lookahead ( u_lexer_t * l )
Parameters:
l An active lexer context.
Returns:
a pointer to the substring still to be parsed

Definition at line 78 of file srcs/toolbox/lexer.c.

int u_lexer_new ( const char *  s,
u_lexer_t **  pl 
)
Parameters:
s Pointer to the string that has to be parsed.
pl Handler for the associated lexer instance as a result argument.
Return values:
0 on success
~0 on failure

Definition at line 42 of file srcs/toolbox/lexer.c.

References u_lexer_free(), u_strdup(), and u_zalloc().

Referenced by u_uri_crumble().

int u_lexer_next ( u_lexer_t * l,
char *  pb 
)
Parameters:
l Handler of an active lexer context.
pb If non-NULL it will get the next stream character.
Returns:
See u_lexer_next_ex return codes

Definition at line 150 of file srcs/toolbox/lexer.c.

char u_lexer_peek ( u_lexer_t * l )
Parameters:
l An active lexer context.
Returns:
the character currently under the cursor.

Definition at line 212 of file srcs/toolbox/lexer.c.

Referenced by u_lexer_expect_char().

size_t u_lexer_pos ( u_lexer_t * l )
Parameters:
l An active lexer context.
Returns:
the current string offset of the lexer cursor.

Definition at line 299 of file srcs/toolbox/lexer.c.

void u_lexer_record_lmatch ( u_lexer_t * l )
Parameters:
l An active lexer context.
Returns:
nothing

Definition at line 224 of file srcs/toolbox/lexer.c.

void u_lexer_record_rmatch ( u_lexer_t * l )
Parameters:
l An active lexer context.
Returns:
nothing

Definition at line 237 of file srcs/toolbox/lexer.c.

int u_lexer_seterr ( u_lexer_t * l,
const char *  fmt,
  ... 
)
Parameters:
l Handler of an active lexer context.
fmt printf-like format string.
... variable argument list to feed fmt.
Return values:
0 on success
~0 on failure (bad parameters supplied)

Definition at line 127 of file srcs/toolbox/lexer.c.
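
A printf-like setter paired with a read-only accessor is commonly built on vsnprintf(). The sketch below shows that pattern and the documented return convention. It is a stand-in, not the library's code: mini_lexer_t, mini_seterr(), mini_geterr() and the MINI_ERR_SZ capacity are all hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MINI_ERR_SZ 256     /* hypothetical capacity of the error buffer */

/* Hypothetical stand-in for a lexer context: only the error string. */
typedef struct {
    char err[MINI_ERR_SZ];
} mini_lexer_t;

/* Format the message into the context's error buffer.
 * Returns 0 on success, ~0 if a NULL parameter is supplied. */
int mini_seterr(mini_lexer_t *l, const char *fmt, ...)
{
    va_list ap;

    if (l == NULL || fmt == NULL)
        return ~0;

    va_start(ap, fmt);
    /* vsnprintf truncates to the buffer size and always NUL-terminates. */
    (void) vsnprintf(l->err, sizeof l->err, fmt, ap);
    va_end(ap);

    return 0;
}

/* Accessor for the last error string (empty if none was set). */
const char *mini_geterr(const mini_lexer_t *l)
{
    return l->err;
}
```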

int u_lexer_skip ( u_lexer_t * l,
char *  pb 
)
Parameters:
l Handler of an active lexer context.
pb If non-NULL it will get the next non-whitespace character from the stream.
Returns:
See u_lexer_next_ex return codes

Definition at line 165 of file srcs/toolbox/lexer.c.


© 2005-2012 - KoanLogic S.r.l. - All rights reserved