You are browsing a version that is no longer maintained.
Simple Parser Example
Extend the Doctrine\Common\Lexer\AbstractLexer class and implement the getCatchablePatterns, getNonCatchablePatterns, and getType methods. Here is a very simple example lexer implementation named CharacterTypeLexer. It tokenizes a string into T_UPPER, T_LOWER and T_NUMBER tokens:
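<?php

use Doctrine\Common\Lexer\AbstractLexer;

class CharacterTypeLexer extends AbstractLexer
{
    const T_UPPER  = 1;
    const T_LOWER  = 2;
    const T_NUMBER = 3;

    // Patterns whose matches become tokens: one alphanumeric character each.
    protected function getCatchablePatterns()
    {
        return array(
            '[a-zA-Z0-9]',
        );
    }

    // Patterns whose matches are skipped; none are needed here.
    protected function getNonCatchablePatterns()
    {
        return array();
    }

    // Map a matched character to its token type. Digits are checked first
    // because strtoupper() and strtolower() leave them unchanged.
    protected function getType(&$value)
    {
        if (is_numeric($value)) {
            return self::T_NUMBER;
        }

        if (strtoupper($value) === $value) {
            return self::T_UPPER;
        }

        if (strtolower($value) === $value) {
            return self::T_LOWER;
        }
    }
}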
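To make the role of the catchable pattern and of getType more concrete, the following standalone sketch approximates the tokenizing step without the library. It is a simplified illustration, not the code of AbstractLexer: the function name sketchTokenize and the string type labels are assumptions made for this example only.

```php
<?php
// Simplified, standalone approximation of the tokenizing step.
// sketchTokenize() is a hypothetical helper, not part of doctrine/lexer.
function sketchTokenize($input)
{
    // Same catchable pattern as CharacterTypeLexer, wrapped in a capture
    // group so that preg_split() returns the matched characters themselves.
    $matches = preg_split(
        '/([a-zA-Z0-9])/',
        $input,
        -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE
    );

    $tokens = array();
    foreach ($matches as $match) {
        list($value, $position) = $match;

        // Mirror getType(): digits first, then upper case, then lower case.
        if (is_numeric($value)) {
            $type = 'T_NUMBER';
        } elseif (strtoupper($value) === $value) {
            $type = 'T_UPPER';
        } else {
            $type = 'T_LOWER';
        }

        $tokens[] = array(
            'value'    => $value,
            'type'     => $type,
            'position' => $position,
        );
    }

    return $tokens;
}

foreach (sketchTokenize('aB1') as $token) {
    echo $token['value'], ' => ', $token['type'], "\n";
}
// Prints:
// a => T_LOWER
// B => T_UPPER
// 1 => T_NUMBER
```

Each token carries its value, its type and its offset in the input, which is the information a parser built on top of the lexer works with.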
Use CharacterTypeLexer to extract an array of upper case characters:
<?php

class UpperCaseCharacterExtracter
{
    private $lexer;

    public function __construct(CharacterTypeLexer $lexer)
    {
        $this->lexer = $lexer;
    }

    public function getUpperCaseCharacters($string)
    {
        $this->lexer->setInput($string);

        // Load the first token into $lookahead; $token is still empty here.
        $this->lexer->moveNext();

        $upperCaseChars = array();

        while (true) {
            // No lookahead means the input is exhausted.
            if (!$this->lexer->lookahead) {
                break;
            }

            // Advance: the lookahead becomes the current token.
            $this->lexer->moveNext();

            if ($this->lexer->token['type'] === CharacterTypeLexer::T_UPPER) {
                $upperCaseChars[] = $this->lexer->token['value'];
            }
        }

        return $upperCaseChars;
    }
}

$upperCaseCharacterExtractor = new UpperCaseCharacterExtracter(new CharacterTypeLexer());
$upperCaseCharacters = $upperCaseCharacterExtractor->getUpperCaseCharacters('1aBcdEfgHiJ12');

print_r($upperCaseCharacters);
The variable $upperCaseCharacters contains all of the upper case characters from the input string:

Array
(
    [0] => B
    [1] => E
    [2] => H
    [3] => J
)
This is a simple example, but it demonstrates the low-level API that can be used to build more complex parsers.
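As one possible next step, the same loop can do slightly more than filter tokens. The sketch below groups consecutive characters of the same type into runs. The function groupCharacterRuns is a hypothetical helper written for this example; it relies only on the calls already used on this page (setInput, moveNext, and the lookahead and token properties) and assumes tokens are arrays with 'type' and 'value' keys, as in the extractor class.

```php
<?php
// groupCharacterRuns() is a hypothetical helper, not part of doctrine/lexer.
// $lexer is assumed to be a CharacterTypeLexer, or any object exposing
// setInput(), moveNext() and the public lookahead/token properties.
function groupCharacterRuns($lexer, $string)
{
    $lexer->setInput($string);

    // Load the first token into the lookahead.
    $lexer->moveNext();

    $runs = array();

    while ($lexer->lookahead) {
        // Advance: the lookahead becomes the current token.
        $lexer->moveNext();
        $token = $lexer->token;
        $last  = count($runs) - 1;

        if ($last >= 0 && $runs[$last]['type'] === $token['type']) {
            // Same type as the previous character: extend the current run.
            $runs[$last]['value'] .= $token['value'];
        } else {
            // Type changed, or this is the first token: start a new run.
            $runs[] = array(
                'type'  => $token['type'],
                'value' => $token['value'],
            );
        }
    }

    return $runs;
}
```

Called as groupCharacterRuns(new CharacterTypeLexer(), 'abCD12'), this should yield three runs with the values ab, CD and 12, typed T_LOWER, T_UPPER and T_NUMBER respectively.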