LET'S BUILD A COMPILER!(4)---第三部分:再论表达式续
空白字符
结束本章之前,我们再来讨论一下空白字符的问题。现在这个版本的分析器会在读到一个空白字符的地方停下来。这是相当不友好的行为。所以让我们消除这个最后的限制,使分析器的表现更有商业产品的味道。
处理空白字符的关键在于制定一条规则,规定分析器改如何处理对待输入的空白字符,并在整个分析器中都遵守它。目前为止,空白字符还是不被允许的,我们可以假定在每个分析动作之后,先行字符Look都包含的是个有意义的字符,然后可以立即验证这点。我们设计的方法就是基于上述原则。
具体来说,就是每个需要先行读取输入流的过程都必须忽略空白字符,并且最后Look变量中只能保留非空白字符。所幸,我们是用GetName,GetNum和Match完成大部分了输入操作,所以只有三个过程(再加上Init过程)需要修改。
首先,定义一个新的识别空白字符的过程。
{--------------------------------------------------------------}
{ Recognize White Space }
function IsWhite(c: char): boolean;
begin
IsWhite := c in [' ', TAB];
end;
{--------------------------------------------------------------}
还需要一个过程“吃掉”空白字符,直到遇见一个非空白字符。
{--------------------------------------------------------------}
{ Skip Over Leading White Space }
procedure SkipWhite;
begin
while IsWhite(Look) do
GetChar;
end;
{--------------------------------------------------------------}
现在,在Match,GetName和GetNum过程中调用SkipWhite:
{--------------------------------------------------------------}
{ Match a Specific Input Character }
procedure Match(x: char);
begin
if Look <> x then Expected('''' + x + '''')
else begin
GetChar;
SkipWhite;
end;
end;
{--------------------------------------------------------------}
{ Get an Identifier }
function GetName: string;
var Token: string;
begin
Token := '';
if not IsAlpha(Look) then Expected('Name');
while IsAlNum(Look) do begin
Token := Token + UpCase(Look);
GetChar;
end;
GetName := Token;
SkipWhite;
end;
{--------------------------------------------------------------}
{ Get a Number }
function GetNum: string;
var Value: string;
begin
Value := '';
if not IsDigit(Look) then Expected('Integer');
while IsDigit(Look) do begin
Value := Value + Look;
GetChar;
end;
GetNum := Value;
SkipWhite;
end;
{--------------------------------------------------------------}
(注意,我对Match过程进行了重新组织,但是没有改变它的功能。)
最后,还要在Init中略过源文件开头的空白符。
{--------------------------------------------------------------}
{ Initialize }
procedure Init;
begin
GetChar;
SkipWhite;
end;
{--------------------------------------------------------------}
如上改写程序并编译,注意把Match过程放在SkipWhite的后面,否则通不过编译。测试这个新版本,以保证它能正常运行。
由于本章中我们对原来的程序做了不少的修改,现在我在下面列出修改后的完整程序:
{--------------------------------------------------------------}
program parse;
{--------------------------------------------------------------}
{ Constant Declarations }
const TAB = ^I;
CR = ^M;
{--------------------------------------------------------------}
{ Variable Declarations }
var Look: char; { Lookahead Character }
{--------------------------------------------------------------}
{ Read New Character From Input Stream }
procedure GetChar;
begin
Read(Look);
end;
{--------------------------------------------------------------}
{ Report an Error }
procedure Error(s: string);
begin
WriteLn;
WriteLn(^G, 'Error: ', s, '.');
end;
{--------------------------------------------------------------}
{ Report Error and Halt }
procedure Abort(s: string);
begin
Error(s);
Halt;
end;
{--------------------------------------------------------------}
{ Report What Was Expected }
procedure Expected(s: string);
begin
Abort(s + ' Expected');
end;
{--------------------------------------------------------------}
{ Recognize an Alpha Character }
function IsAlpha(c: char): boolean;
begin
IsAlpha := UpCase(c) in ['A'..'Z'];
end;
{--------------------------------------------------------------}
{ Recognize a Decimal Digit }
function IsDigit(c: char): boolean;
begin
IsDigit := c in ['0'..'9'];
end;
{--------------------------------------------------------------}
{ Recognize an Alphanumeric }
function IsAlNum(c: char): boolean;
begin
IsAlNum := IsAlpha(c) or IsDigit(c);
end;
{--------------------------------------------------------------}
{ Recognize an Addop }
function IsAddop(c: char): boolean;
begin
IsAddop := c in ['+', '-'];
end;
{--------------------------------------------------------------}
{ Recognize White Space }
function IsWhite(c: char): boolean;
begin
IsWhite := c in [' ', TAB];
end;
{--------------------------------------------------------------}
{ Skip Over Leading White Space }
procedure SkipWhite;
begin
while IsWhite(Look) do
GetChar;
end;
{--------------------------------------------------------------}
{ Match a Specific Input Character }
procedure Match(x: char);
begin
if Look <> x then Expected('''' + x + '''')
else begin
GetChar;
SkipWhite;
end;
end;
{--------------------------------------------------------------}
{ Get an Identifier }
function GetName: string;
var Token: string;
begin
Token := '';
if not IsAlpha(Look) then Expected('Name');
while IsAlNum(Look) do begin
Token := Token + UpCase(Look);
GetChar;
end;
GetName := Token;
SkipWhite;
end;
{--------------------------------------------------------------}
{ Get a Number }
function GetNum: string;
var Value: string;
begin
Value := '';
if not IsDigit(Look) then Expected('Integer');
while IsDigit(Look) do begin
Value := Value + Look;
GetChar;
end;
GetNum := Value;
SkipWhite;
end;
{--------------------------------------------------------------}
{ Output a String with Tab }
procedure Emit(s: string);
begin
Write(TAB, s);
end;
{--------------------------------------------------------------}
{ Output a String with Tab and CRLF }
procedure EmitLn(s: string);
begin
Emit(s);
WriteLn;
end;
{---------------------------------------------------------------}
{ Parse and Translate a Identifier }
procedure Ident;
var Name: string[8];
begin
Name:= GetName;
if Look = '(' then begin
Match('(');
Match(')');
EmitLn('BSR ' + Name);
end
else
EmitLn('MOVE ' + Name + '(PC),D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate a Math Factor }
procedure Expression; Forward;
procedure Factor;
begin
if Look = '(' then begin
Match('(');
Expression;
Match(')');
end
else if IsAlpha(Look) then
Ident
else
EmitLn('MOVE #' + GetNum + ',D0');
end;
{--------------------------------------------------------------}
{ Recognize and Translate a Multiply }
procedure Multiply;
begin
Match('*');
Factor;
EmitLn('MULS (SP)+,D0');
end;
{-------------------------------------------------------------}
{ Recognize and Translate a Divide }
procedure Divide;
begin
Match('/');
Factor;
EmitLn('MOVE (SP)+,D1');
EmitLn('EXS.L D0');
EmitLn('DIVS D1,D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate a Math Term }
procedure Term;
begin
Factor;
while Look in ['*', '/'] do begin
EmitLn('MOVE D0,-(SP)');
case Look of
'*': Multiply;
'/': Divide;
end;
end;
end;
{--------------------------------------------------------------}
{ Recognize and Translate an Add }
procedure Add;
begin
Match('+');
Term;
EmitLn('ADD (SP)+,D0');
end;
{-------------------------------------------------------------}
{ Recognize and Translate a Subtract }
procedure Subtract;
begin
Match('-');
Term;
EmitLn('SUB (SP)+,D0');
EmitLn('NEG D0');
end;
{---------------------------------------------------------------}
{ Parse and Translate an Expression }
procedure Expression;
begin
if IsAddop(Look) then
EmitLn('CLR D0')
else
Term;
while IsAddop(Look) do begin
EmitLn('MOVE D0,-(SP)');
case Look of
'+': Add;
'-': Subtract;
end;
end;
end;
{--------------------------------------------------------------}
{ Parse and Translate an Assignment Statement }
procedure Assignment;
var Name: string[8];
begin
Name := GetName;
Match('=');
Expression;
EmitLn('LEA ' + Name + '(PC),A0');
EmitLn('MOVE D0,(A0)')
end;
{--------------------------------------------------------------}
{ Initialize }
procedure Init;
begin
GetChar;
SkipWhite;
end;
{--------------------------------------------------------------}
{ Main Program }
begin
Init;
Assignment;
If Look <> CR then Expected('NewLine');
end.
{--------------------------------------------------------------}
上面已经是完整的分析器了。它已经具有了一个“行编译器”的所有特征。建议你把它保存起来。下次我们将转向一个新主题,但是仍然要谈一点表达式的问题。下一章,我计划介绍一些关于解释器(interpreter)的知识,并展示当我们改变分析器的动作时,它的结构是如何跟着改动的。我们在下一章中要学到的东西以后还会用到的,即使你对解释器不感兴趣。下章见
(未完待续)