Week 3, June 4th - June 8th

Monday, June 4th

Happy to be back after a relaxing weekend. I was quite ready to work more on the CYK Parsing Code. I decided that most of today would be dedicated to figuring out someway to not only keep track of all the possible ways to parse a string using CYK, but to also recreate one (or more) of those derivations. Through the use of a backtracking recursive method and a node class I designed, I created a way for derivations to be created, without any change to the already existing code. After having designed a class that was essentially a double map due to the fact that 2D arrays don't work with generics, with Julian's help I discovered we could trick the compiler into accepting the array, completely getting rid of the need for our map class. As much of parsing has the need to be threaded, I also started reading through the Java tutorial on threading so I can program that in soon. Peggy and I also brainstormed a little about how we want the visualization to appear and how the user should interact with it. Hopefully with Julian, we will come up with a great design that fits in with the rest of our GUI.

Tuesday, June 5th

Thought I had finished with all the code for the CYK Algorithm, but another day means more things to fix. To reduce the need for threading as well as to fit easier with a GUI framework, we redesigned the entire Parsing package to require a whole new set of methods and design. The two biggest additions were the modification of parsers to be Steppable, that is, give them the option to be done one step at a time (with a step-to-completion option to do the entire algorithm) as well as a getDerivation method that outputs the specific productions in order that were done to derive the string. It took a lot of code change and thinking outside what we had already done, but the CYK Parser is one step closer to being GUI-ready. I started brushing up on my Swing knowledge and will continue that tomorrow so I can help with the visualization of this (and other) algorithms.

Wednesday, June 6th

Another long day of parsing! This time however, it was only a little bit about CYK and much more focused on fixing the Brute Parser. The lunch break was quite cool today, it was really interesting to see what possibilities there are to increase the capabilities of mobile gaming. After that interesting presentation, Peggy and I went back to work on the parsers. We also sorted through a lot of code looking at ways to optimize our algorithm. The old code is a little tough to comprehend, but hopefully tomorrow we'll sort out some good optimizations. I also went through the swing tutorial and coded an extremely simple digital clock.

Thursday, June 7th

Today we played around a lot with the new brute parser, trying to find ways to optimize it and figure out if it's as efficient as the old one. In doing so, we found out that whatever node count that the old BruteParser displayed as being created was a little arbitrary. Our new parser states it is creating a LOT more nodes, but is able to handle strings that the old one couldn't, and doing it faster, so yay for us! We also started discussing how we're going to save everything, and that will be a nice refresher/learning experience in XML. If that wasn't enough, we also starting thinking and doing some really basic coding dealing with our sets/languages project that may take up the majority of the rest of the summer.

Friday, June 8th

I can't believe how much we've accomplished this week! Not only did we create two new parsers and optimize them, but today we worked on getting our Set/Language project underway. I wrote classes today that implemented basic Set operations such as intersection, union, difference, and powerset. I also decided to help Peggy out on finding an algorithm to generate all the strings in a giving language (which as of right now is created with either a specific grammar or automaton). After a lot of brainstorming and code that couldn't find strings longer than one character long, I got two different methods. One (which is O(n^m) where n is the number of terminals and m is the length you want to generate, so is about as bad as you can get) can find every single string in order of length that a language can generate, while the other (much faster) uses the framework of our Brute Parser to generate a lot of strings that can be generated, just not EXACTLY in order. Hopefully we'll figure something out that can be better than either, but for now, being able to produce sentences from nothing is really cool.