LET'S BUILD A COMPILER!(5) - 王朝网络宽屏版

LET'S BUILD A COMPILER!(4)---第三部分：再论表达式续

空白字符

结束本章之前，我们再来讨论一下空白字符的问题。现在这个版本的分析器会在读到一个空白字符的地方停下来。这是相当不友好的行为。所以让我们消除这个最后的限制，使分析器的表现更有商业产品的味道。

处理空白字符的关键在于制定一条规则，规定分析器改如何处理对待输入的空白字符，并在整个分析器中都遵守它。目前为止，空白字符还是不被允许的，我们可以假定在每个分析动作之后，先行字符Look都包含的是个有意义的字符，然后可以立即验证这点。我们设计的方法就是基于上述原则。

具体来说，就是每个需要先行读取输入流的过程都必须忽略空白字符，并且最后Look变量中只能保留非空白字符。所幸，我们是用GetName，GetNum和Match完成大部分了输入操作，所以只有三个过程（再加上Init过程）需要修改。

首先，定义一个新的识别空白字符的过程。

{--------------------------------------------------------------}

{ Recognize White Space }

function IsWhite(c: char): boolean;

begin

IsWhite := c in [' ', TAB];

end;

{--------------------------------------------------------------}

还需要一个过程“吃掉”空白字符，直到遇见一个非空白字符。

{--------------------------------------------------------------}

{ Skip Over Leading White Space }

procedure SkipWhite;

begin

while IsWhite(Look) do

GetChar;

end;

{--------------------------------------------------------------}

现在，在Match，GetName和GetNum过程中调用SkipWhite：

{--------------------------------------------------------------}

{ Match a Specific Input Character }

procedure Match(x: char);

begin

if Look <> x then Expected('''' + x + '''')

else begin

GetChar;

SkipWhite;

end;

{--------------------------------------------------------------}

{ Get an Identifier }

function GetName: string;

var Token: string;

begin

Token := '';

if not IsAlpha(Look) then Expected('Name');

while IsAlNum(Look) do begin

Token := Token + UpCase(Look);

GetChar;

end;

GetName := Token;

SkipWhite;

end;

{--------------------------------------------------------------}

{ Get a Number }

function GetNum: string;

var Value: string;

begin

Value := '';

if not IsDigit(Look) then Expected('Integer');

while IsDigit(Look) do begin

Value := Value + Look;

GetChar;

end;

GetNum := Value;

SkipWhite;

end;

{--------------------------------------------------------------}

（注意，我对Match过程进行了重新组织，但是没有改变它的功能。）

最后，还要在Init中略过源文件开头的空白符。

{--------------------------------------------------------------}

{ Initialize }

procedure Init;

begin

GetChar;

SkipWhite;

end;

{--------------------------------------------------------------}

如上改写程序并编译，注意把Match过程放在SkipWhite的后面，否则通不过编译。测试这个新版本，以保证它能正常运行。

由于本章中我们对原来的程序做了不少的修改，现在我在下面列出修改后的完整程序：

{--------------------------------------------------------------}

program parse;

{--------------------------------------------------------------}

{ Constant Declarations }

const TAB = ^I;

CR = ^M;

{--------------------------------------------------------------}

{ Variable Declarations }

var Look: char; { Lookahead Character }

{--------------------------------------------------------------}

{ Read New Character From Input Stream }

procedure GetChar;

begin

Read(Look);

end;

{--------------------------------------------------------------}

{ Report an Error }

procedure Error(s: string);

begin

WriteLn;

WriteLn(^G, 'Error: ', s, '.');

end;

{--------------------------------------------------------------}

{ Report Error and Halt }

procedure Abort(s: string);

begin

Error(s);

Halt;

end;

{--------------------------------------------------------------}

{ Report What Was Expected }

procedure Expected(s: string);

begin

Abort(s + ' Expected');

end;

{--------------------------------------------------------------}

{ Recognize an Alpha Character }

function IsAlpha(c: char): boolean;

begin

IsAlpha := UpCase(c) in ['A'..'Z'];

end;

{--------------------------------------------------------------}

{ Recognize a Decimal Digit }

function IsDigit(c: char): boolean;

begin

IsDigit := c in ['0'..'9'];

end;

{--------------------------------------------------------------}

{ Recognize an Alphanumeric }

function IsAlNum(c: char): boolean;

begin

IsAlNum := IsAlpha(c) or IsDigit(c);

end;

{--------------------------------------------------------------}

{ Recognize an Addop }

function IsAddop(c: char): boolean;

begin

IsAddop := c in ['+', '-'];

end;

{--------------------------------------------------------------}

{ Recognize White Space }

function IsWhite(c: char): boolean;

begin

IsWhite := c in [' ', TAB];

end;

{--------------------------------------------------------------}

{ Skip Over Leading White Space }

procedure SkipWhite;

begin

while IsWhite(Look) do

GetChar;

end;

{--------------------------------------------------------------}

{ Match a Specific Input Character }

procedure Match(x: char);

begin

if Look <> x then Expected('''' + x + '''')

else begin

GetChar;

SkipWhite;

end;

{--------------------------------------------------------------}

{ Get an Identifier }

function GetName: string;

var Token: string;

begin

Token := '';

if not IsAlpha(Look) then Expected('Name');

while IsAlNum(Look) do begin

Token := Token + UpCase(Look);

GetChar;

end;

GetName := Token;

SkipWhite;

end;

{--------------------------------------------------------------}

{ Get a Number }

function GetNum: string;

var Value: string;

begin

Value := '';

if not IsDigit(Look) then Expected('Integer');

while IsDigit(Look) do begin

Value := Value + Look;

GetChar;

end;

GetNum := Value;

SkipWhite;

end;

{--------------------------------------------------------------}

{ Output a String with Tab }

procedure Emit(s: string);

begin

Write(TAB, s);

end;

{--------------------------------------------------------------}

{ Output a String with Tab and CRLF }

procedure EmitLn(s: string);

begin

Emit(s);

WriteLn;

end;

{---------------------------------------------------------------}

{ Parse and Translate a Identifier }

procedure Ident;

var Name: string[8];

begin

Name:= GetName;

if Look = '(' then begin

Match('(');

Match(')');

EmitLn('BSR ' + Name);

end

else

EmitLn('MOVE ' + Name + '(PC),D0');

end;

{---------------------------------------------------------------}

{ Parse and Translate a Math Factor }

procedure Expression; Forward;

procedure Factor;

begin

if Look = '(' then begin

Match('(');

Expression;

Match(')');

end

else if IsAlpha(Look) then

Ident

else

EmitLn('MOVE #' + GetNum + ',D0');

end;

{--------------------------------------------------------------}

{ Recognize and Translate a Multiply }

procedure Multiply;

begin

Match('*');

Factor;

EmitLn('MULS (SP)+,D0');

end;

{-------------------------------------------------------------}

{ Recognize and Translate a Divide }

procedure Divide;

begin

Match('/');

Factor;

EmitLn('MOVE (SP)+,D1');

EmitLn('EXS.L D0');

EmitLn('DIVS D1,D0');

end;

{---------------------------------------------------------------}

{ Parse and Translate a Math Term }

procedure Term;

begin

Factor;

while Look in ['*', '/'] do begin

EmitLn('MOVE D0,-(SP)');

case Look of

'*': Multiply;

'/': Divide;

end;

{--------------------------------------------------------------}

{ Recognize and Translate an Add }

procedure Add;

begin

Match('+');

Term;

EmitLn('ADD (SP)+,D0');

end;

{-------------------------------------------------------------}

{ Recognize and Translate a Subtract }

procedure Subtract;

begin

Match('-');

Term;

EmitLn('SUB (SP)+,D0');

EmitLn('NEG D0');

end;

{---------------------------------------------------------------}

{ Parse and Translate an Expression }

procedure Expression;

begin

if IsAddop(Look) then

EmitLn('CLR D0')

else

Term;

while IsAddop(Look) do begin

EmitLn('MOVE D0,-(SP)');

case Look of

'+': Add;

'-': Subtract;

end;

{--------------------------------------------------------------}

{ Parse and Translate an Assignment Statement }

procedure Assignment;

var Name: string[8];

begin

Name := GetName;

Match('=');

Expression;

EmitLn('LEA ' + Name + '(PC),A0');

EmitLn('MOVE D0,(A0)')

end;

{--------------------------------------------------------------}

{ Initialize }

procedure Init;

begin

GetChar;

SkipWhite;

end;

{--------------------------------------------------------------}

{ Main Program }

begin

Init;

Assignment;

If Look <> CR then Expected('NewLine');

end.

{--------------------------------------------------------------}

上面已经是完整的分析器了。它已经具有了一个“行编译器”的所有特征。建议你把它保存起来。下次我们将转向一个新主题，但是仍然要谈一点表达式的问题。下一章，我计划介绍一些关于解释器（interpreter）的知识，并展示当我们改变分析器的动作时，它的结构是如何跟着改动的。我们在下一章中要学到的东西以后还会用到的，即使你对解释器不感兴趣。下章见

(未完待续)