PHP Cross Reference of Limb3

/tests_runner/lib/simpletest/ -> parser.php (summary)

base include file for SimpleTest

Version: $Id: parser.php 5999 2007-06-18 13:13:08Z pachanga $
File Size: 775 lines (30 kb)
Included or required: 0 times
Referenced: 0 times
Includes or requires: 0 files

Defines 6 classes

ParallelRegex:: (5 methods):
  ParallelRegex()
  addPattern()
  match()
  _getCompoundedRegex()
  _getPerlMatchingFlags()

SimpleStateStack:: (4 methods):
  SimpleStateStack()
  getCurrent()
  enter()
  leave()

SimpleLexer:: (13 methods):
  SimpleLexer()
  addPattern()
  addEntryPattern()
  addExitPattern()
  addSpecialPattern()
  mapHandler()
  parse()
  _dispatchTokens()
  _isModeEnd()
  _isSpecialMode()
  _decodeSpecial()
  _invokeParser()
  _reduce()

SimpleHtmlLexer:: (6 methods):
  SimpleHtmlLexer()
  _getParsedTags()
  _addSkipping()
  _addTag()
  _addInTagTokens()
  _addAttributeTokens()

SimpleHtmlSaxParser:: (11 methods):
  SimpleHtmlSaxParser()
  parse()
  createLexer()
  acceptStartToken()
  acceptEndToken()
  acceptAttributeToken()
  acceptEntityToken()
  acceptTextToken()
  ignore()
  decodeHtml()
  normalise()

SimpleSaxListener:: (4 methods):
  SimpleSaxListener()
  startElement()
  endElement()
  addContent()


Class: ParallelRegex  - X-Ref

Compounded regular expression. Any of
the contained patterns could match and
when one does, its label is returned.

ParallelRegex($case)   X-Ref
Constructor. Starts with no patterns.

param: boolean $case    True for case sensitive, false for case insensitive.

addPattern($pattern, $label = true)   X-Ref
Adds a pattern with an optional label.

param: string $pattern      Perl style regex, but ( and ) are escaped and match literally.
param: string $label        Label of regex to be returned on a match.

match($subject, &$match)   X-Ref
Attempts to match all patterns at once against
a string.

param: string $subject      String to match against.
param: string $match        First matched portion of the subject.
return: boolean             True on success.

_getCompoundedRegex()   X-Ref
Compounds the patterns into a single
regular expression separated with the
"or" operator. Caches the regex.
Will automatically escape (, ) and / tokens.

param: array $patterns    List of patterns in order.

_getPerlMatchingFlags()   X-Ref
Accessor for perl regex mode flags to use.

return: string       Perl regex flags.
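
The compounding described above can be illustrated with a minimal sketch. This is not the SimpleTest source: the class name CompoundRegexSketch and its private compound() helper are hypothetical, and the exact flags used by the real class are not shown in this summary. The sketch wraps each pattern in its own group, joins the groups with "|", and returns the label of the group that produced the match.

```php
<?php
// Hypothetical stand-in for ParallelRegex, written only to illustrate the idea.
final class CompoundRegexSketch
{
    private array $patterns = [];
    private array $labels = [];
    private ?string $regex = null;   // cached compound expression
    private bool $caseSensitive;

    public function __construct(bool $caseSensitive)
    {
        $this->caseSensitive = $caseSensitive;
    }

    public function addPattern(string $pattern, $label = true): void
    {
        $this->patterns[] = $pattern;
        $this->labels[] = $label;
        $this->regex = null;         // a new pattern invalidates the cache
    }

    // Returns the label of the pattern that matched, or false on no match.
    public function match(string $subject, &$match)
    {
        $match = '';
        if ($this->patterns === [] || !preg_match($this->compound(), $subject, $groups)) {
            return false;
        }
        $match = $groups[0];
        // Group i + 1 belongs to pattern i; the non-empty group identifies the winner.
        foreach ($this->labels as $i => $label) {
            if (($groups[$i + 1] ?? '') !== '') {
                return $label;
            }
        }
        return true;
    }

    private function compound(): string
    {
        if ($this->regex === null) {
            // Escape (, ) and / so user patterns cannot add groups or end the regex.
            $wrapped = array_map(
                fn (string $p): string => '(' . str_replace(['/', '(', ')'], ['\/', '\(', '\)'], $p) . ')',
                $this->patterns
            );
            $this->regex = '/' . implode('|', $wrapped) . '/' . ($this->caseSensitive ? 'ms' : 'msi');
        }
        return $this->regex;
    }
}

$regex = new CompoundRegexSketch(false);
$regex->addPattern('\d+', 'number');
$regex->addPattern('[a-z]+', 'word');
$label = $regex->match('  Hello 42', $found);
```

Because user-supplied parentheses are escaped, the only capturing groups in the compound expression are the ones added by the wrapper, which is what lets the group index be mapped back to a label.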

Class: SimpleStateStack  - X-Ref

States for a stack machine.

SimpleStateStack($start)   X-Ref
Constructor. Starts in named state.

param: string $start        Starting state name.

getCurrent()   X-Ref
Accessor for current state.

return: string       State.

enter($state)   X-Ref
Adds a state to the stack and sets it
to be the current state.

param: string $state        New state.

leave()   X-Ref
Leaves the current state and reverts
to the previous one.

return: boolean    False if we drop off the bottom of the stack.
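
A minimal sketch of the stack behaviour described above, assuming the starting state can never be popped. The class name StateStackSketch is hypothetical and this is not the SimpleTest source.

```php
<?php
// Hypothetical stand-in for SimpleStateStack.
final class StateStackSketch
{
    private array $stack;

    public function __construct(string $start)
    {
        $this->stack = [$start];
    }

    public function getCurrent(): string
    {
        return $this->stack[count($this->stack) - 1];
    }

    public function enter(string $state): void
    {
        $this->stack[] = $state;
    }

    // Returns false instead of popping the starting state.
    public function leave(): bool
    {
        if (count($this->stack) === 1) {
            return false;
        }
        array_pop($this->stack);
        return true;
    }
}

$states = new StateStackSketch('text');
$states->enter('tag');
$inside = $states->getCurrent();   // the newly entered state
$left = $states->leave();          // back to the starting state
$again = $states->leave();         // nothing left to leave
```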

Class: SimpleLexer  - X-Ref

Accepts text and breaks it into tokens.
Some optimisation to make sure the
content is only scanned by the PHP regex
parser once. Lexer modes must not start
with leading underscores.

SimpleLexer(&$parser, $start = "accept", $case = false)   X-Ref
Sets up the lexer in case insensitive matching
by default.

param: SimpleSaxParser $parser  Handling strategy by reference.
param: string $start            Starting handler.
param: boolean $case            True for case sensitive.

addPattern($pattern, $mode = "accept")   X-Ref
Adds a token search pattern for a particular
parsing mode. The pattern does not change the
current mode.

param: string $pattern      Perl style regex, but ( and ) are escaped and match literally.
param: string $mode         Should only apply this pattern in this mode.

addEntryPattern($pattern, $mode, $new_mode)   X-Ref
Adds a pattern that will enter a new parsing
mode. Useful for entering parentheses, strings,
tags, etc.

param: string $pattern      Perl style regex, but ( and ) are escaped and match literally.
param: string $mode         Should only apply this pattern in this mode.
param: string $new_mode     Change parsing to this new mode.

addExitPattern($pattern, $mode)   X-Ref
Adds a pattern that will exit the current mode
and re-enter the previous one.

param: string $pattern      Perl style regex, but ( and ) are escaped and match literally.
param: string $mode         Mode to leave.

addSpecialPattern($pattern, $mode, $special)   X-Ref
Adds a pattern that has a special mode. Acts as an entry
and exit pattern in one go, effectively calling a special
parser handler for this token only.

param: string $pattern      Perl style regex, but ( and ) are escaped and match literally.
param: string $mode         Should only apply this pattern in this mode.
param: string $special      Use this mode for this one token.

mapHandler($mode, $handler)   X-Ref
Adds a mapping from a mode to another handler.

param: string $mode        Mode to be remapped.
param: string $handler     New target handler.

parse($raw)   X-Ref
Splits the page text into tokens. Will fail
if the handlers report an error or if no
content is consumed. If successful then each
unparsed and parsed token invokes a call to the
held listener.

param: string $raw        Raw HTML text.
return: boolean           True on success, else false.

_dispatchTokens($unmatched, $matched, $mode = false)   X-Ref
Sends the matched token and any leading unmatched
text to the parser changing the lexer to a new
mode if one is listed.

param: string $unmatched    Unmatched leading portion.
param: string $matched      Actual token match.
param: string $mode         Mode after match. A boolean false means no mode change.
return: boolean             False if there was any error from the parser.

_isModeEnd($mode)   X-Ref
Tests to see if the new mode is actually to leave
the current mode and pop an item from the matching
mode stack.

param: string $mode    Mode to test.
return: boolean        True if this is the exit mode.

_isSpecialMode($mode)   X-Ref
Test to see if the mode is one where this mode
is entered for this token only and automatically
leaves immediately afterwards.

param: string $mode    Mode to test.
return: boolean        True if this is a single token mode.

_decodeSpecial($mode)   X-Ref
Strips the magic underscore marking single token
modes.

param: string $mode    Mode to decode.
return: string         Underlying mode name.
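
The three helpers above suggest that the mode string itself carries the instruction for what the lexer does after a match. The following sketch shows one way that could work. The "__exit" sentinel and all names ending in Sketch are assumptions for illustration; this summary only states that single token modes are marked with a leading underscore and that ordinary mode names must not start with one.

```php
<?php
// Assumed sentinel for "leave the current mode"; not confirmed by this summary.
const EXIT_MODE_SKETCH = '__exit';

// Marks a mode as applying to a single token only.
function encodeSpecialSketch(string $mode): string
{
    return '_' . $mode;
}

// Strips the leading underscore again.
function decodeSpecialSketch(string $mode): string
{
    return substr($mode, 1);
}

// Decides what a lexer would do with the mode attached to a matched pattern.
// The exit check runs first because the sentinel also starts with an underscore.
function classifyModeSketch($mode): string
{
    if ($mode === false) {
        return 'stay';
    }
    if ($mode === EXIT_MODE_SKETCH) {
        return 'exit';
    }
    if (strncmp($mode, '_', 1) === 0) {
        return 'special';
    }
    return 'enter';
}

$special = encodeSpecialSketch('comment');
$kind = classifyModeSketch($special);
$plain = decodeSpecialSketch($special);
```

Reserving the leading underscore for these markers is consistent with the rule that lexer modes must not start with underscores.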

_invokeParser($content, $is_match)   X-Ref
Calls the parser method named after the current
mode. Empty content will be ignored. The lexer
has a parser handler for each mode in the lexer.

param: string $content        Text parsed.
param: boolean $is_match      Token is recognised rather than unparsed data.

_reduce($raw)   X-Ref
Tries to match a chunk of text and if successful
removes the recognised chunk and any leading
unparsed data. Empty strings will not be matched.

param: string $raw         The subject to parse.
return: array/boolean      Three item list of unparsed content, matched token and mode.
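
The reduce step can be sketched as a function that consumes the front of the input on every call. This is a simplified illustration, not the SimpleTest source: reduceSketch() is a hypothetical name, it takes a single plain regex instead of a per-mode compound regex, it returns only the unmatched and matched parts (the mode is omitted), and returning true when nothing is recognised is an assumption.

```php
<?php
// Hypothetical sketch: removes the first match and any text before it from $raw.
function reduceSketch(string &$raw, string $regex)
{
    if ($raw === '' || !preg_match($regex, $raw, $m, PREG_OFFSET_CAPTURE)) {
        return true;                 // nothing (more) to recognise
    }
    [$matched, $offset] = $m[0];
    if ($matched === '') {
        return true;                 // empty strings are not treated as tokens
    }
    $unmatched = substr($raw, 0, $offset);
    // Keep only the text after the token for the next call.
    $raw = (string) substr($raw, $offset + strlen($matched));
    return [$unmatched, $matched];
}

$raw = 'ab12cd345';
$tokens = [];
while (is_array($parts = reduceSketch($raw, '/\d+/'))) {
    $tokens[] = $parts;              // [leading unparsed text, recognised token]
}
```

Each call shortens the subject, so the loop ends once no further token is found. Whatever remains in $raw at that point would be passed on as trailing unparsed text.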

Class: SimpleHtmlLexer  - X-Ref

Breaks HTML into SAX events.

SimpleHtmlLexer(&$parser)   X-Ref
Sets up the lexer with case insensitive matching
and adds the HTML handlers.

param: SimpleSaxParser $parser  Handling strategy by reference.

_getParsedTags()   X-Ref
List of parsed tags. Others are ignored.

return: array        List of searched for tags.

_addSkipping()   X-Ref
The lexer has to skip certain sections such
as server code, client code and styles.


_addTag($tag)   X-Ref
Pattern matches to start and end a tag.

param: string $tag          Name of tag to scan for.

_addInTagTokens()   X-Ref
Pattern matches to parse the inside of a tag
including the attributes and their quoting.


_addAttributeTokens()   X-Ref
Matches attributes that are either single quoted,
double quoted or unquoted.


Class: SimpleHtmlSaxParser  - X-Ref

Converts HTML tokens into selected SAX events.

SimpleHtmlSaxParser(&$listener)   X-Ref
Sets the listener.

param: SimpleSaxListener $listener    SAX event handler.

parse($raw)   X-Ref
Runs the content through the lexer which
should call back to the acceptors.

param: string $raw      Page text to parse.
return: boolean         False if parse error.

createLexer(&$parser)   X-Ref
Sets up the matching lexer. Starts in 'text' mode.

param: SimpleSaxParser $parser    Event generator, usually $this.
return: SimpleLexer               Lexer suitable for this parser.

acceptStartToken($token, $event)   X-Ref
Accepts a token from the tag mode. If the
starting element completes then the element
is dispatched and the current attributes
set back to empty. The element or attribute
name is converted to lower case.

param: string $token     Incoming characters.
param: integer $event    Lexer event type.
return: boolean          False if parse error.

acceptEndToken($token, $event)   X-Ref
Accepts a token from the end tag mode.
The element name is converted to lower case.

param: string $token     Incoming characters.
param: integer $event    Lexer event type.
return: boolean          False if parse error.

acceptAttributeToken($token, $event)   X-Ref
Part of the tag data.

param: string $token     Incoming characters.
param: integer $event    Lexer event type.
return: boolean          False if parse error.

acceptEntityToken($token, $event)   X-Ref
A character entity.

param: string $token    Incoming characters.
param: integer $event   Lexer event type.
return: boolean         False if parse error.

acceptTextToken($token, $event)   X-Ref
Character data between tags regarded as
important.

param: string $token     Incoming characters.
param: integer $event    Lexer event type.
return: boolean          False if parse error.

ignore($token, $event)   X-Ref
Incoming data to be ignored.

param: string $token     Incoming characters.
param: integer $event    Lexer event type.
return: boolean          False if parse error.

decodeHtml($html)   X-Ref
Decodes any HTML entities.

param: string $html    Incoming HTML.
return: string         Outgoing plain text.

normalise($html)   X-Ref
Turns HTML into the text a text browser would show. Images
are converted to their alt text and tags are suppressed.
Entities are converted to their visible representation.

param: string $html        HTML to convert.
return: string             Plain text.
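
The behaviour described for decodeHtml() and normalise() can be approximated with standard PHP functions. The sketch below is an assumption about how such helpers might look, not the SimpleTest source: the function names are hypothetical, only double-quoted alt attributes are handled, and collapsing runs of whitespace is an added simplification.

```php
<?php
// Hypothetical sketch of entity decoding using the built-in html_entity_decode().
function decodeHtmlSketch(string $html): string
{
    return html_entity_decode($html, ENT_QUOTES, 'UTF-8');
}

// Hypothetical sketch of turning HTML into roughly what a text browser shows.
function normaliseSketch(string $html): string
{
    // Replace each image with its double-quoted alt text, when it has one.
    $text = preg_replace('/<img\b[^>]*\balt\s*=\s*"([^"]*)"[^>]*>/i', ' $1 ', $html);
    // Suppress all remaining tags.
    $text = preg_replace('/<[^>]*>/', '', $text);
    // Entities become their visible characters.
    $text = decodeHtmlSketch($text);
    // Collapse whitespace so removed tags do not leave gaps.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$plain = normaliseSketch('<p>Fish &amp; <b>chips</b> <img src="c.png" alt="photo"></p>');
```

Tags are stripped before entities are decoded so that an encoded "&lt;" in the content is not mistaken for the start of a tag.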

Class: SimpleSaxListener  - X-Ref

SAX event handler.

SimpleSaxListener()   X-Ref
Sets the document to write to.


startElement($name, $attributes)   X-Ref
Start of element event.

param: string $name        Element name.
param: hash $attributes    Name value pairs.
return: boolean            False on parse error.

endElement($name)   X-Ref
End of element event.

param: string $name        Element name.
return: boolean            False on parse error.

addContent($text)   X-Ref
Unparsed, but relevant data.

param: string $text        May include unparsed tags.
return: boolean            False on parse error.
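
As an illustration of the listener interface, the sketch below records the three kinds of event in order. It is a standalone class with a hypothetical name so that it runs without the library. In real use one would presumably extend SimpleSaxListener and pass the instance to SimpleHtmlSaxParser, which would then make these calls. The event sequence shown is written by hand to mimic what a parser might dispatch for a single link.

```php
<?php
// Hypothetical listener that records every SAX event it receives.
final class RecordingListenerSketch
{
    public array $events = [];

    public function startElement(string $name, array $attributes): bool
    {
        $this->events[] = ['start', $name, $attributes];
        return true;                 // false would signal a parse error
    }

    public function endElement(string $name): bool
    {
        $this->events[] = ['end', $name];
        return true;
    }

    public function addContent(string $text): bool
    {
        $this->events[] = ['content', $text];
        return true;
    }
}

// Hand-written event sequence for: <a href="index.html">Home</a>
$listener = new RecordingListenerSketch();
$ok = $listener->startElement('a', ['href' => 'index.html'])
    && $listener->addContent('Home')
    && $listener->endElement('a');
```

Returning a boolean from every handler lets the caller stop at the first failure, which matches the "False on parse error" convention used throughout this file.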



Generated: Sat Nov 22 03:48:54 2008 Cross-referenced by PHPXref 0.7