+If I have any regular readers, you may have noticed that I have not blogged in a while. That is because I have been attending college at North Carolina State University (Go Wolfpack!). Among the courses I took last semester was CSC 116, Introduction to Computing - Java.
+It wasn't necessarily a bad class, though I had considered trying to place out and jump straight to 216, which is apparently where things begin to get interesting. (On the other hand, considering my grade in Calculus, I probably needed an easy course.) But as a person who someday wants to be a teacher of computing, taking the course gave me a perspective on how we could improve computer education for the future. So, let us begin.
+Essentially, CSC 116 is your standard entry-level programming course for computer science, and for other majors (mostly from Engineering, Math, and Science) that want to learn how to do "standard" programming (as opposed to something like FORTRAN or MATLAB). And as with many entry-level CSC courses at colleges worldwide, it uses Java as the language.
+Now, Java has a number of nice properties that make it suitable for an introductory programming language:
+* It has all the important control flow constructs -- if/else if/else statements, for loops, while loops, do/while loops, exceptions, etc. etc.
+There is theoretically BeanShell, but it seems to be dead -- it has not been updated since 2005, as far as I can tell. The other major option is Groovy, which is a separate language altogether. This may seem like a minor fault against Java, but the major advantage of a REPL in an educational context is that it teaches one of the greatest concepts in programming: expressions.
+If I then compile and run it, it will print `There are 4 lights.` Whereas, in a theoretical Java REPL, I can do:
+This doesn't seem like much, but it reinforces that `"There are 4 lights."` is a String value, just like if you had typed it in as a literal. And any objects and arrays you create, any math you do, also produces a value, and you can use it anywhere a value of that type is acceptable.
+That is one of the major breakthroughs I had when I was learning Python -- understanding "everything is a value." REPLs help with that because you can see the individual parts of a larger program, and see how the computer understands those parts, rather than just seeing what happens when you do a bunch of stuff in a row and `System.out.println` it.
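+To make the "everything is a value" point concrete without a REPL, here is a minimal sketch in plain Java. The class `ValuesDemo` and the helper `lights` are hypothetical names invented for this illustration; they are not from the course materials.

```java
class ValuesDemo {
    // Hypothetical helper: builds the sentence and returns it as a value
    // instead of printing it.
    static String lights(int count) {
        return "There are " + count + " lights.";
    }

    public static void main(String[] args) {
        // 2 + 2 is an int value; lights(2 + 2) is a String value.
        String sentence = lights(2 + 2);

        // A computed String can go anywhere a String literal could.
        System.out.println(sentence);                               // There are 4 lights.
        System.out.println(sentence.equals("There are 4 lights.")); // true
        System.out.println(lights(2 + 2).toUpperCase());            // THERE ARE 4 LIGHTS.
    }
}
```

+The printing is incidental here: the interesting part is that `lights(2 + 2)` can be stored, compared, or passed along like any other `String`.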
+Another advantage of the REPL is that you don't have to learn to do I/O before you can begin understanding values and objects. For example, we were doing `Scanner console = new Scanner(System.in);` before we learned what `System.in` was, what the `new` operator did, or what a constructor was -- even what an object was. Having a REPL would allow us to explore all these concepts of data and data types before having to deal with I/O at all, and then we would have a better conceptual foundation of how to deal with I/O as just another task of a program.
+And finally, with a REPL you don't have an edit/compile/run cycle -- you just type and go. Again, that isn't a major thing for experienced programmers, but for many people in the class it was their first time using the Linux command line. E 115, the course that teaches you about the command line and other aspects of computing on campus, is a corequisite, but when you have to deal with the command line on Day 1 of the class it's not much help. (Though I am thankful we were learning on Linux instead of Windows.)
+I won't spend too much time on this, but the way Java's libraries are used is in many cases rather unintuitive. Take `java.util.Scanner`, a token-based input processor. Being token-based isn't a bad thing in itself, except that the first time you use `java.util.Scanner` it is in the context of line-based console input. For example:
+If I type in `12` and press ENTER, the `Scanner` will pull the `12` from the input stream, but leave the cursor right before the `\n`. Which means that if I have something like this:
+That second call to `console.nextLine` will just eat the leftover newline and return an empty string, so the program tries to save to a file with a blank name, which is not at all what is needed. This is an intentional behavior of the `Scanner` class, but it's not useful for "talking" to the computer over the console, and it was the source of many a bug.
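+The leftover-newline behavior can be reproduced without a keyboard. This is a minimal sketch that feeds a `Scanner` from a string instead of `System.in`; the class name, the method `readAfterInt`, and the file name `results.txt` are made up for this illustration.

```java
import java.util.Scanner;

class ScannerNewlineDemo {
    // Reads an int, then two lines, and returns the two lines.
    static String[] readAfterInt(Scanner console) {
        console.nextInt();                     // consumes the digits, not the "\n" after them
        String leftover = console.nextLine();  // rest of the first line: an empty string
        String fileName = console.nextLine();  // the line the user actually typed next
        return new String[] { leftover, fileName };
    }

    public static void main(String[] args) {
        // Stand-in for the console: "12", ENTER, "results.txt", ENTER.
        Scanner console = new Scanner("12\nresults.txt\n");
        String[] lines = readAfterInt(console);
        System.out.println(lines[0].isEmpty()); // true
        System.out.println(lines[1]);           // results.txt
    }
}
```

+One common workaround is an extra `nextLine()` call right after `nextInt()`. Another is to read every line with `nextLine()` and convert numbers with `Integer.parseInt`.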
+One possibility I entertained if I were to be placed in charge of the CSC department is redoing the course structure such that CSC 116 is given in a simpler, easier to deal with language, and then in CSC 216 you learn the syntax of Java in a couple of weeks and then get started with the fancy object orientation. I can think of a few languages that would work well for this:
+* Python is a very nice and regular language (and it forces you to indent, which is something 116 students are very much in need of). Though there is the question of whether to teach Python 2 or Python 3, not to mention that the language itself has plenty of "historical warts."
+* Lua is also a nice and simple language, and the variety of ways in which tables can be used would be good for explaining data structures. It's also very nice and readable.
+* A functional language like Haskell or OCaml would be interesting, but it would be a major paradigm shift switching from 116 to 216.
+* I have been reading quite a bit about haXe recently, and it is a very nice language -- object-oriented and statically typed, but with type inference and a lot of other nice features (plus the multiple runtime support). If not for its obscurity and the difficulties involved in compiling and debugging it, it would be a perfect starter language.
+In any case, for all of Java's flaws, none of them is really a "deal-breaker" for using it as a starter programming language.
+Though Java wasn't really suitable for much of the class, it also wasn't the only problem. The class's curriculum design was also a bit unintuitive.
+We jumped straight into methods and program structure on the second day of class, with programs that looked like this:
+Which is a very superficial (and rather ridiculous) explanation of methods. But what's worse is that when learning about methods that take arguments and return values, the instructors never used the obvious illustration of how it works: mathematical functions.
+`f(3, 2)` means "use 3 for x and 2 for y, and substitute the result of evaluating this expression for `f(3, 2)`." This is the perfect analogy for how formal parameters and return values work (well, maybe not *perfect*, but better than what we did), yet we never used it. And as a result, many of the students never really understood methods.
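+The analogy maps onto Java almost word for word. In this sketch the definition f(x, y) = x² + y is an assumption chosen purely for illustration, since the expression used above is not shown; the class name is likewise hypothetical.

```java
class FunctionDemo {
    // Mirrors the assumed mathematical definition f(x, y) = x^2 + y.
    // x and y are the formal parameters; the return value is the result.
    static int f(int x, int y) {
        return x * x + y;
    }

    public static void main(String[] args) {
        // "Use 3 for x and 2 for y": the call f(3, 2) evaluates to 11,
        // and that value takes the call's place in the larger expression.
        int result = f(3, 2) + 1;
        System.out.println(f(3, 2)); // 11
        System.out.println(result);  // 12
    }
}
```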
+The first type of control structure we learned about was the for loop. Unfortunately, it was not the "for each" loop that goes:
+With some slightly different scoping rules. I think it makes far more sense to teach students about if statements first (because remember, this is the first control structure in the class), then while loops so they understand how looping works, *then* explain the for loop as a way to wrap all of that up in a nice, convenient package. Because the C-style for loop really has just about no value over a while besides the fact that it is in a nice, convenient package.
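+The idea that a C-style for loop is a while loop in a convenient package can be sketched as two methods that do the same work. The class and method names are hypothetical, and summing 1 through n is just a convenient example task.

```java
class LoopEquivalence {
    // C-style for loop: initialization, condition, and update on one line.
    static int sumWithFor(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // The same loop spelled out as a while loop.
    static int sumWithWhile(int n) {
        int total = 0;
        int i = 1;           // initialization (i stays in scope after the loop)
        while (i <= n) {     // condition
            total += i;
            i++;             // update
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumWithFor(10));   // 55
        System.out.println(sumWithWhile(10)); // 55
    }
}
```

+The only real difference is the scoping one: in the while version, `i` is still visible after the loop ends.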
+I mentioned this particular issue to some of my colleagues at the NCSU LUG, and they brought up the fact that a for loop is a "closed" loop, whereas a while loop is open-ended -- i.e. much easier to go into an infinite loop if the student makes a mistake. In theory this is true, but in practice, something like this was a fairly common mistake:
+This code is intended to go from 10 to 1 and then print "Blast off!", but thanks to the use of `i++` instead of `i--` it counts forever, until you eventually hit an integer overflow or press ^C. So, the C-style for loop is in fact an open-ended loop, despite its implications to the contrary, and it is just as easy to accidentally infinitize a C-style for loop as it is a while loop. (Especially with students who don't understand the concept of loops to begin with.)
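+The mistake is easy to demonstrate safely by capping the number of iterations. In this sketch the `limit` parameter is an addition for the demo only, so that the buggy version terminates quickly; the class and method names are hypothetical.

```java
class Countdown {
    // Counts how many times the countdown loop body runs, giving up
    // after `limit` iterations so the buggy variant still terminates.
    static int iterations(int step, int limit) {
        int count = 0;
        for (int i = 10; i >= 1 && count < limit; i += step) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Correct loop (equivalent to i--): runs for i = 10, 9, ..., 1.
        System.out.println(iterations(-1, 1000)); // 10

        // Buggy loop (equivalent to i++): i >= 1 stays true, so only the cap stops it.
        System.out.println(iterations(+1, 1000)); // 1000
    }
}
```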
+In both cases, a truly closed for loop would be far more useful. For example, Lua's numerical for loop:
+This is much harder to break than the C-style for loop because it guarantees that if you use an improper range (e.g. from 10 to 1 with a step of 1) it will simply not iterate, and more importantly it doesn't require introducing conditionals and general loops to understand properly. It could be emulated in Java with something along the lines of:
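+A minimal sketch of such an emulation, assuming a hypothetical `Range` class with inclusive bounds and a fixed step of 1 (the names and design here are illustrative, not a standard library API):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class Range implements Iterable<Integer> {
    private final int first;
    private final int last;

    // Both bounds are inclusive, like Lua's numeric for.
    Range(int first, int last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            // A long cursor avoids overflow when last is Integer.MAX_VALUE.
            private long current = first;

            @Override
            public boolean hasNext() {
                return current <= last;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return (int) current++;
            }
        };
    }

    public static void main(String[] args) {
        for (int i : new Range(1, 10)) {
            System.out.println(i);           // prints 1 through 10
        }
        for (int i : new Range(10, 1)) {
            System.out.println("not reached"); // an improper range never iterates
        }
    }
}
```

+Because the loop variable is produced by the iterator, there is no update expression for the student to get wrong.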
+(In fact, haXe actually uses syntactic sugar for this with the `IntIter` class -- `1...10` is transformed into `new IntIter(1, 10)`.)
+Similarly, anything that involves iterating over an array without needing to manipulate the items by index could…just iterate over it.
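+For instance, summing an array needs no index at all. This is a small sketch using Java's enhanced for loop; the class and method names are hypothetical.

```java
class ForEachDemo {
    // Visits every element in order; there is no index to get wrong.
    static int total(int[] values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] scores = { 3, 1, 4 };
        System.out.println(total(scores)); // 8
    }
}
```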
+The only data structure we used in the class was the built-in Java array. Which is very nice for random access where you know the data size in advance, but not so much for just about everything you really need a data structure for.
+For example, the fact that you must specify the size at construction time was a major pain during Project 5. We had to read a data file full of CSC alumni into an array (actually three arrays, since we hadn't gotten to objects yet), but we didn't know the size of the file in advance. The solution I used was to open the file, read through it just to count the lines, close it, create the array, and then *reopen* it and loop through it a *second* time to actually read the data.
+There were very few cases where we used arrays in situations that played to their strengths as a data structure, and even in those cases an `ArrayList` would have been useful, so we wouldn't have had to work around the size limitation. In fact, in a lot of cases all we were doing was adding data to the end and iterating, which is something much better suited to a `LinkedList`. Not to mention the times we used arrays as maps and sets…`HashMap` and `HashSet`, anyone?
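+With a growable list, the count-then-reread dance disappears. This is a minimal sketch, assuming one record per line; the class `AlumniReader`, the method `readLines`, and the sample names are invented for illustration, and a string stands in for the real data file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class AlumniReader {
    // Reads every remaining line in a single pass; the list grows as needed,
    // so the number of lines does not have to be known in advance.
    static List<String> readLines(Scanner in) {
        List<String> lines = new ArrayList<>();
        while (in.hasNextLine()) {
            lines.add(in.nextLine());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the data file; new Scanner(new java.io.File(...))
        // would read from disk the same way.
        Scanner in = new Scanner("Ada\nGrace\nLinus\n");
        List<String> alumni = readLines(in);
        System.out.println(alumni.size()); // 3
        System.out.println(alumni.get(1)); // Grace
    }
}
```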
+One thing that bugged the heck out of me was the fact that we covered indentation very little. I wanted to go to the front of the room and deliver a lecture called "CSC 116 Students Need To Learn To Indent Or I Will Kill Them All." It's simple -- after you type an opening curly brace, press ENTER, then use TAB to indent one level. At the end of that curly brace section, hit ENTER and backspace over the TAB before you type the closing curly brace. Yet we spent very little time covering it. If we had, then it would have been much easier to spot bugs in people's code due to improper nesting.
+That said, we did at least go over documentation, which is one of my favorite aspects of proper code maintenance. (Javadoc's HTML output is ugly, but unfortunately there's no way to fix that.) But we talked very little about how to write *good* documentation, and the TAs really didn't grade the documentation on quality, so a lot of it was rather superficial. (They did not grade spelling and grammar very well either.)
+We also went over testing. Which is another nice nod to best practices…but unfortunately instead of something related to a test framework, we used a bunch of `System.out.println("Expected: " + expectedValue + "Actual: " + actualValue);`. Not even `assert`. (I usually ended up hacking together my own little test frameworks for the projects where unit testing was required.) JUnit or something would have been nice.
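+Even without JUnit, a few lines of plain Java can turn "print and eyeball it" into a pass/fail check. This is a rough sketch of the kind of hand-rolled helper described above, not the code actually used in the class; `TinyTest`, `check`, `failureCount`, and the `max` method under test are all hypothetical names.

```java
import java.util.Objects;

class TinyTest {
    private static int failures = 0;

    // Compares expected and actual, reports the outcome, and counts failures.
    static void check(String name, Object expected, Object actual) {
        if (Objects.equals(expected, actual)) {
            System.out.println("PASS " + name);
        } else {
            failures++;
            System.out.println("FAIL " + name
                    + " (expected: " + expected + ", actual: " + actual + ")");
        }
    }

    static int failureCount() {
        return failures;
    }

    // Hypothetical method under test.
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    public static void main(String[] args) {
        check("max picks the larger value", 7, max(3, 7));
        check("max handles equal values", 4, max(4, 4));
        System.out.println(failureCount() + " failure(s)");
    }
}
```

+The point of the design is that the computer does the comparison, so a failure is reported explicitly rather than depending on someone reading the output.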
+One important aspect that our Computer Science department is trying to focus on is communication in CSC. So, every class we would have a Review Presentation delivered by a student on the topic of the previous class. In theory, it's a good idea. After all, the best way to know you understand something is to explain it to someone else.
+The problem is that many of the presenters either didn't understand the concept they were supposed to be reviewing, or simply didn't care about the presentation. (Largely because it amounted to about 0.5% of the final grade.)
+And this is the biggest problem with the CSC 116 course: the instruction delivered a very superficial understanding of how to program. Very little effort was put into ensuring that everyone understood the concepts, or understood *why* we did something instead of just what to type; most TA assistance was simply "get the code to compile and move on." As a result, when the time came for people to explain the concepts, the actual code was half-remembered (I get the impression that a lot of people just typed in roughly how to do something and fiddled with it until it compiled), and any understanding of what each part of a statement *did*, and why you needed it, was shockingly absent.
+So, the best way to improve CSC 116 (in addition to shuffling the curriculum a bit) would be to **focus on how it all works**. A stronger focus on data and data types, on algorithms and problem-solving technique, and on what the computer is doing for you behind the scenes would make it much easier for students to understand programming -- which should be the goal of the class.
+I know that inevitably someone will comment on this article, "That's why you learn programming before you go to university, duh." Unfortunately, the fact of the matter is that often, even students who are going to university for computer science haven't been exposed to programming yet. Getting computer science properly integrated into secondary school is another major goal of mine, but that's a topic for another day.
+But more importantly, it is not just the computer scientists who are taking CSC 116. For example, the guy who sat next to me was an aerospace engineer. Our class also had plenty of people from other degrees seeking a CS minor, especially math majors. For many of them, this is also their first experience with programming, and it needs to be complete, understandable, and relevant so they understand how programming can benefit their lives.