Commits

Anonymous committed 165225e

fixed some typos in cucu blog articles

  • Parent commits 934e35d

Files changed (3)

File input/blog/cucu-part1.mkd

 		...
 	}
 
-Let's try to write it down in [EBNF][1] form (it's absilutely ok, if you don't
+Let's try to write it down in [EBNF][1] form (it's absolutely ok, if you don't
 know what EBNF is, it's really intuitive):
 
 	<program> ::= { <var-decl> | <func-decl> | <func-def> } ;
 declarations and definitions? Ok, let's go deeper:
 
 	<var-decl> ::= <type> <ident> ";"
-	<func-decl> ::= <type> <ident> "( <func-args> ")" ";"
+	<func-decl> ::= <type> <ident> "(" <func-args> ")" ";"
 	<func-def> ::= <type> <ident> "(" <func-args> ")" <func-body>
 	<func-args> ::= { <type> <ident> "," }
 	<type> ::= "int" | "char *"
 
 	/* These are simple statements */
 	i = 2 + 3; /* assignment statement */
-	my_func(i); /* function call stament */
+	my_func(i); /* function call statement */
 	return i; /* return statement */
 
 	/* These are compound statements */
 	if (x > 0) { .. } else { .. }
 	while (x > 0) { .. }
 
-*Expression* is a smaller part of the statement. As opposed to stamenents,
-expressions always return a value.  Usually, it's just the arithmetics. For
+*Expression* is a smaller part of the statement. As opposed to statements,
+expressions always return a value.  Usually, it's just the arithmetic. For
 example in the statement `func(x[2], i + j)` the expressions are `x[2]` and
 `i+j`.
 
 	<func-body> ::= <statement>
 	<statement> ::= "{" { <statement> } "}"                /* block statement */
 	                | [<type>] <ident> [ "=" <expr> ] ";"  /* assignment */
-									| "return" <expr> ";"
-									| "if" "(" <expr> ")" <statement> [ "else" <statement> ]
-									| "while" "(" <expr> ")" <statement>
-									| <expr> ";"
+	                | "return" <expr> ";"
+	                | "if" "(" <expr> ")" <statement> [ "else" <statement> ]
+	                | "while" "(" <expr> ")" <statement>
+	                | <expr> ";"
 
 Here are possible expressions in CUCU language:
 
 	           | <bitwise-expr> = <expr>
 	<bitwise-expr> ::= <eq-expr>
 	                   | <bitwise-expr> & <eq-expr>
-										 | <bitwise-expr> | <eq-expr>
+	                   | <bitwise-expr> | <eq-expr>
 	<eq-expr> ::= <rel-expr>
 	              | <eq-expr> == <rel-expr>
-								| <eq-expr> != <rel-expr>
+	              | <eq-expr> != <rel-expr>
 	<rel-expr> ::= <shift-expr>
 	               | <rel-expr> < <shift-expr>
 	<shift-expr> ::= <add-expr>
 	                 | <shift-expr> << <add-expr>
-									 | <shift-expr> >> <add-expr>
+	                 | <shift-expr> >> <add-expr>
 	<add-expr> ::= <postfix-expr>
 	               | <add-expr> + <postfix-expr>
-								 | <add-expr> - <postfix-expr>
+	               | <add-expr> - <postfix-expr>
 	<postfix-expr> ::= <prim-expr>
 	                   | <postfix-expr> [ <expr> ]
-										 | <postfix-expr> ( <expr> { "," <expr> } )
+	                   | <postfix-expr> ( <expr> { "," <expr> } )
 	<prim-expr> := <number> | <ident> | <string> | "(" <expr> ")"
 
 That's it. Did you notice the recursion in the expression notation?  Basically,
 
 For example, according to this grammar an expression `8>>1+1`
 will be evaluated to 2 (like in `8>>(1+1)`), not to 5 (like in `(8>>1)+1`),
-because `>>` has lower prece than `+`.
+because `>>` has lower precedence than `+`.
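
Review note on the hunk above: the precedence claim can be checked with a small evaluator. This is only a sketch, assuming a cut-down grammar with numbers, parentheses, `+`, `-`, `<<` and `>>`. The names `eval`, `p` and `skip_spaces` are hypothetical and not taken from the CUCU sources, and the left-recursive rules are written as loops, which is a common way to implement them in a recursive descent parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Cursor into the expression text (hypothetical helper state). */
static const char *p;

static int shift_expr(void);

static void skip_spaces(void) {
	while (*p == ' ') p++;
}

/* <prim-expr> ::= <number> | "(" <expr> ")"   (identifiers, strings omitted) */
static int prim_expr(void) {
	int v = 0;
	skip_spaces();
	if (*p == '(') {
		p++;
		v = shift_expr();
		skip_spaces();
		if (*p == ')') p++;
		return v;
	}
	while (isdigit((unsigned char)*p)) {
		v = v * 10 + (*p - '0');
		p++;
	}
	return v;
}

/* <add-expr>: binds tighter than shifts, so it is parsed one level deeper. */
static int add_expr(void) {
	int v = prim_expr();
	for (;;) {
		skip_spaces();
		if (*p == '+') { p++; v += prim_expr(); }
		else if (*p == '-') { p++; v -= prim_expr(); }
		else return v;
	}
}

/* <shift-expr>: each operand of << or >> is a whole <add-expr>. */
static int shift_expr(void) {
	int v = add_expr();
	for (;;) {
		skip_spaces();
		if (p[0] == '<' && p[1] == '<') { p += 2; v <<= add_expr(); }
		else if (p[0] == '>' && p[1] == '>') { p += 2; v >>= add_expr(); }
		else return v;
	}
}

/* Evaluate a constant expression given as text. */
int eval(const char *s) {
	assert(s != NULL);
	p = s;
	return shift_expr();
}
```

With this sketch, `eval("8>>1+1")` groups as `8>>(1+1)`, because the shift level asks the additive level for each of its operands before applying the shift.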
 
 lexer
 -----
 split that stream into smaller tokens, that can be processed later. It gives us
 some level of abstraction and simplifies out parser.
 
-For example, a sequance of bytes "int i = 2+31;" will be split into tokens:
+For example, a sequence of bytes "int i = 2+31;" will be split into tokens:
 
 	int
 	i
 				} else if (nextc == '/') {
 					readchr();
 					if (nextc == '*') {
-						nextc == fgetc(f);
+						nextc = fgetc(f);
 						while (nextc != '/') {
 							while (nextc != '*') {
 								nextc = fgetc(f);
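
Review note on the hunk above: the fix turns a comparison into the intended assignment. The hunk is truncated, so it is not visible here whether the nested loops also stop at end of file. A minimal sketch of a comment skipper that does, assuming the opening delimiter has already been consumed; `skip_block_comment` is a hypothetical name, not a function from the CUCU sources:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Skip the remainder of a block comment. The opening slash and star are
 * assumed to have been read already. Returns 0 once the closing star and
 * slash have been consumed, or -1 if the input ends first.
 */
int skip_block_comment(FILE *f) {
	int prev = 0;   /* previously read character */
	int c;
	assert(f != NULL);
	while ((c = fgetc(f)) != EOF) {
		if (prev == '*' && c == '/') {
			return 0;
		}
		prev = c;
	}
	return -1;      /* unterminated comment */
}
```

Tracking only the previous character keeps this to a single loop, and an unterminated comment is reported to the caller instead of looping forever.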

File input/blog/cucu-part2.mkd

 the harder part
 ---------------
 
-As you can see from the language grammar, stataments and various expression 
+As you can see from the language grammar, statements and various expression 
 types are strongly interconnected. It means we have to write all parser 
 functions at once, keeping in mind the recursion. Let's go again from top
 to bottom. Here's our top-level compiler() functions:
 		}
 	}
 
-It reads type name, then an indentifier. If it's followed by a semicolon -
+It reads type name, then an identifier. If it's followed by a semicolon -
 it's a variable declaration. If it's followed by a paren - it's a function.
 Function scans function arguments one by one, and if function is not
 followed by a semicolon - it's a definition (function with a body), otherwise - 
 it's just a declaration (just function name and prototype).
 
 Here, `typename()` is function that just skips the valid type name. We accept
-only `int` and `char` and varoius pointers to them (`char *`):
+only `int` and `char` and various pointers to them (`char *`):
 
 	static int typename() {
 		if (peek("int") || peek("char")) {
 function that parses an expression. Expression parser is a recursive descent 
 parser, so it's a number of functions that call each other recursively until 
 primary expression is found. Primary expression as we can see from the grammar
-is a number (constant) or an indentifier (variable or function).
+is a number (constant) or an identifier (variable or function).
 
 	static void prim_expr() {
 		if (isdigit(tok[0])) {
 * `L` - is a local variable. `addr` stores variable location on the stack
 * `A` - function argument. `addr` also stores the location on the stack
 * `U` - undefined global variable. `addr` stores absolute address in RAM.
-* `D` - defined global valiable. Same as above.
+* `D` - defined global variable. Same as above.
 
 So far, I've added two functions: `sym_find(char *s)` to find symbol by its 
 name, and `sym_declare()` to add a new symbol.
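
Review note on the hunk above: for readers of this diff, a symbol table of the kind described could look roughly like the sketch below. Only `sym_find(char *s)`, `sym_declare`, `addr` and the type letters come from the article. The struct layout, the limits, the `symtab` array and the parameters of `sym_declare` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXSYMBOLS 64   /* hypothetical capacity */
#define MAXNAMELEN 32   /* hypothetical name length limit */

struct sym {
	char type;              /* 'L', 'A', 'U' or 'D', as listed in the article */
	int addr;               /* stack location or absolute address */
	char name[MAXNAMELEN];
};

static struct sym symtab[MAXSYMBOLS];
static int sym_count = 0;

/* Find a symbol by name; the most recent declaration wins. */
struct sym *sym_find(char *s) {
	int i;
	assert(s != NULL);
	for (i = sym_count - 1; i >= 0; i--) {
		if (strcmp(symtab[i].name, s) == 0) {
			return &symtab[i];
		}
	}
	return NULL;
}

/* Add a new symbol; returns NULL if the table is full or the name is too long. */
struct sym *sym_declare(char *name, char type, int addr) {
	struct sym *s;
	assert(name != NULL);
	if (sym_count >= MAXSYMBOLS || strlen(name) >= MAXNAMELEN) {
		return NULL;
	}
	s = &symtab[sym_count++];
	strcpy(s->name, name);
	s->type = type;
	s->addr = addr;
	return s;
}
```

Searching from the newest entry backwards is a design choice of this sketch: it lets a local declared later hide a global with the same name.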

File input/blog/cucu-part3.mkd

 highly portable.
 
 I wanted CUCU to be a portable compiler (actually, a cross-compiler).
-So, I deciced to move backend code generator to a separate module.
+So, I decided to move backend code generator to a separate module.
 
 But before we dive into the backend code generation, let's think of how we will
 test it.
 
 I have chosen a simpler way. Every instruction is 8 bytes long (yes, it's huge,
 but who cares - it's a test imaginary architecture). And the first 7 bytes of
-the instrction are just ASCII symbols, and the last one is 0x10 ('\n').
+the instruction are just ASCII symbols, and the last one is 0x10 ('\n').
 
 This allows us to use human-readable instruction codes, like `A:=A+B`,
 `A:=1ef8`, or `push A`. These seem to be self-explanatory ("add register B 
 * `m[B]:=A` - store the value of A to address stored in B (as byte)
 * `M[B]:=A` - store the value of A to address stored in B (as int)
 * `push A` - push the value of A on the stack
-* `pop B` - pop the valud from the stack to B
+* `pop B` - pop the value from the stack to B
 * `A:=B+A` - add A and B
 * `A:=B-A` - subtract A and B
 * `A:=B&A` - bitwise AND operation
 * `A:=B==A` - A is 1 if B==A, and 0 otherwise
 * `A:=B<<A` - shift left the value of B to A bits
 * `A:=B>>A` - shift right the value of B to A bits
-* `A:=B<A` - A is 1 if B&lt;A, and 0 othersize
+* `A:=B<A` - A is 1 if B&lt;A, and 0 otherwise
 * `popNNNN` - drop NNNN items from the stack
 * `sp@NNNN` - put the value at address NNNN on the stack to the register A
 * `jmpNNNN` - jump to address NNNN
 and to post-process byte code.
 
 Compiler provides the function `emit()`, that emits byte code to the `code[]`
-array. At the very end, `code[]` conatins a ready-to-use compiled program.
+array. At the very end, `code[]` contains a ready-to-use compiled program.
 
 So, compiler calls backend function, and backend just calls emit() with the
 specific byte-codes, and this is how we get compiled machine code.
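
Review note on the hunk above: a rough sketch of how `emit()` and the 8-byte instruction format could fit together. The article names only `emit()` and `code[]`. The signature, `codepos`, `emit_instr`, the buffer size and the space padding of short mnemonics are assumptions. Note that `'\n'` is byte 0x0A in ASCII.

```c
#include <assert.h>
#include <string.h>

#define MAXCODESZ 4096   /* hypothetical capacity of code[] */
#define INSTR_LEN 8      /* 7 ASCII characters plus '\n' */

char code[MAXCODESZ];
int codepos = 0;

/* Append raw bytes to code[]; returns 0 on success, -1 if code[] is full. */
int emit(const char *buf, int len) {
	assert(buf != NULL && len >= 0);
	if (codepos + len > MAXCODESZ) {
		return -1;
	}
	memcpy(code + codepos, buf, (size_t)len);
	codepos += len;
	return 0;
}

/* Emit one instruction: the mnemonic padded with spaces to 7 bytes, then '\n'. */
int emit_instr(const char *mnemonic) {
	char instr[INSTR_LEN];
	size_t n;
	assert(mnemonic != NULL);
	n = strlen(mnemonic);
	if (n > INSTR_LEN - 1) {
		return -1;      /* does not fit into a single instruction */
	}
	memset(instr, ' ', INSTR_LEN - 1);
	memcpy(instr, mnemonic, n);
	instr[INSTR_LEN - 1] = '\n';   /* 0x0A */
	return emit(instr, INSTR_LEN);
}
```

A backend function for pushing a register would then reduce to a call such as `emit_instr("push A")`, and dumping `code[]` to a terminal shows one instruction per line.
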
 right part, and then performs an operation.
 
 This is how a typical math expression is compiled (remember a joke about an
-elefant, a giraffe and a fridge?):
+elephant, a giraffe and a fridge?):
 
 	..evaluate left part
 	push A
 	A:=M[A] # get its value as int
 	push A  # store it
 	A:=0000 # put 0 to A
-	pop B   # get the value of "k" stored earlies
+	pop B   # get the value of "k" stored earlier
 	A:=B==A # compare A and B ("k" and zero)
 	jmz0090 # if false (A!=B, k!=0) - jump to 0x90
 	A:=0001 # store 1 to A as return value
 	A:=0000 # store 0 to A as return value
 	pop0001 # free space allocated for "k" on the stack
 	ret     # and return
-	ret     # remember about double-return for safery? ;)
+	ret     # remember about double-return for safety? ;)
 
 Yeah, the code is so dirty and bloated. But it works. And which is more
 important, you know now how compilers work and how create your own one.