Google Ngrams Analyzer

Google offers a service known as Ngrams that records how many times each word or phrase was used in a given year across the books Google has scanned (out of all books ever written, to the extent that they have them). They offer the raw data for download, so I wrote a C++ program that parses and analyzes this data to find the most common word or phrase across all of those books.
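To make the idea concrete, here is a minimal sketch of that kind of analysis, not the original program. It assumes each line of the raw data is tab-separated as ngram, year, match count, volume count; that layout is an assumption to check against the files actually downloaded. The names totalCounts and mostCommon are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Assumed line layout (hypothetical, verify against the real files):
//   ngram <TAB> year <TAB> match_count <TAB> volume_count
// Sums match_count over all years for each ngram. Malformed lines are skipped.
std::unordered_map<std::string, std::uint64_t> totalCounts(std::istream& in) {
    std::unordered_map<std::string, std::uint64_t> totals;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string ngram, year, matches;
        if (!std::getline(fields, ngram, '\t')) continue;
        if (!std::getline(fields, year, '\t')) continue;
        if (!std::getline(fields, matches, '\t')) continue;
        std::uint64_t count = 0;
        try {
            count = std::stoull(matches);  // parse before touching the map
        } catch (const std::exception&) {
            continue;  // not a number: ignore this line
        }
        totals[ngram] += count;
    }
    return totals;
}

// Returns the ngram with the largest total, or an empty string if there is none.
// Ties are broken arbitrarily because the map is unordered.
std::string mostCommon(const std::unordered_map<std::string, std::uint64_t>& totals) {
    std::string best;
    std::uint64_t bestCount = 0;
    for (const auto& entry : totals) {
        if (entry.second > bestCount) {
            best = entry.first;
            bestCount = entry.second;
        }
    }
    return best;
}
```

Opening one downloaded file with std::ifstream and passing it to totalCounts would give per-ngram totals for that file. The full data set is far too large for a single in-memory map, so a real run would likely need to process it file by file.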


Common Phrase

I made a C++ program that looks through a text file named data.txt and finds the most common x-word phrases, where x is a chosen number. In the screenshot I searched for the most common 3-word phrases and listed the top 20 (I can also list the least commonly used phrases) from a file that contained a bunch of Wikipedia articles that I crawled using a Java web crawler I wrote. This project was particularly difficult to optimize. The first version of the program was very slow, but after reworking it with lots of pointers, optimization techniques I thought of, and careful memory management, the final version is lightning fast compared to its predecessor.
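The core counting step can be sketched roughly as follows. This is one simple way to do it, using a sliding window of words and a hash map, and it is not the pointer-based optimized version described above. The names topPhrases, topPhrasesInText, and PhraseCount are made up for illustration, and words are assumed to be separated by whitespace with no punctuation handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using PhraseCount = std::pair<std::string, std::size_t>;

// Counts every run of n consecutive whitespace-separated words and returns the
// k most frequent phrases, most common first (ties broken alphabetically).
std::vector<PhraseCount> topPhrases(std::istream& in, std::size_t n, std::size_t k) {
    if (n == 0) return {};
    std::unordered_map<std::string, std::size_t> counts;
    std::deque<std::string> window;  // the last n words seen
    std::string word;
    while (in >> word) {
        window.push_back(word);
        if (window.size() > n) window.pop_front();
        assert(window.size() <= n);
        if (window.size() == n) {
            std::string phrase = window[0];
            for (std::size_t i = 1; i < n; ++i) {
                phrase += ' ';
                phrase += window[i];
            }
            ++counts[phrase];
        }
    }
    std::vector<PhraseCount> ranked(counts.begin(), counts.end());
    k = std::min(k, ranked.size());
    auto byCountDesc = [](const PhraseCount& a, const PhraseCount& b) {
        if (a.second != b.second) return a.second > b.second;
        return a.first < b.first;
    };
    // Only the first k entries need to be in order.
    std::partial_sort(ranked.begin(), ranked.begin() + k, ranked.end(), byCountDesc);
    ranked.resize(k);
    return ranked;
}

// Convenience wrapper for text that is already in memory.
std::vector<PhraseCount> topPhrasesInText(const std::string& text, std::size_t n, std::size_t k) {
    std::istringstream in(text);
    return topPhrases(in, n, k);
}
```

Calling topPhrases on a std::ifstream opened on data.txt with n = 3 and k = 20 would correspond to the top-20 list of 3-word phrases. Flipping the comparison in byCountDesc would list the least common phrases instead.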

CommonPhrase

Big Number

I wrote a BigNumber class in C++. This project taught me so much about polymorphism and object orientation in C++. The class allows for the addition and subtraction of two numbers of any size. I used this class to generate very large Fibonacci numbers.
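As a rough illustration of how such a class can work, here is a minimal sketch, not the original code. It assumes non-negative integers stored as decimal digits, least significant digit first, and it does schoolbook addition and subtraction. The internal layout and the fibonacci helper are assumptions made for the example, and it does not use polymorphism.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of an arbitrary-size non-negative integer (illustrative only).
class BigNumber {
public:
    explicit BigNumber(const std::string& decimal = "0") {
        // Store digits in reverse so index 0 is the ones place.
        for (auto it = decimal.rbegin(); it != decimal.rend(); ++it) {
            assert(*it >= '0' && *it <= '9');
            digits_.push_back(*it - '0');
        }
        trim();
    }

    BigNumber operator+(const BigNumber& other) const {
        BigNumber sum;
        sum.digits_.clear();
        int carry = 0;
        const std::size_t len = std::max(digits_.size(), other.digits_.size());
        for (std::size_t i = 0; i < len || carry != 0; ++i) {
            int total = carry;
            if (i < digits_.size()) total += digits_[i];
            if (i < other.digits_.size()) total += other.digits_[i];
            sum.digits_.push_back(total % 10);
            carry = total / 10;
        }
        sum.trim();
        return sum;
    }

    // Precondition: *this >= other (negative results are not represented).
    BigNumber operator-(const BigNumber& other) const {
        assert(!(*this < other));
        BigNumber diff;
        diff.digits_.clear();
        int borrow = 0;
        for (std::size_t i = 0; i < digits_.size(); ++i) {
            int d = digits_[i] - borrow - (i < other.digits_.size() ? other.digits_[i] : 0);
            borrow = d < 0 ? 1 : 0;
            if (d < 0) d += 10;
            diff.digits_.push_back(d);
        }
        diff.trim();
        return diff;
    }

    bool operator<(const BigNumber& other) const {
        // Valid because trim() guarantees there are no leading zeros.
        if (digits_.size() != other.digits_.size()) return digits_.size() < other.digits_.size();
        return std::lexicographical_compare(digits_.rbegin(), digits_.rend(),
                                            other.digits_.rbegin(), other.digits_.rend());
    }

    std::string toString() const {
        std::string out;
        for (auto it = digits_.rbegin(); it != digits_.rend(); ++it)
            out.push_back(static_cast<char>('0' + *it));
        return out;
    }

private:
    void trim() {
        while (digits_.size() > 1 && digits_.back() == 0) digits_.pop_back();
        if (digits_.empty()) digits_.push_back(0);
    }
    std::vector<int> digits_;  // least significant digit first
};

// Iterative Fibonacci with F(0) = 0 and F(1) = 1.
BigNumber fibonacci(unsigned n) {
    BigNumber a("0"), b("1");
    for (unsigned i = 0; i < n; ++i) {
        BigNumber next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Storing one decimal digit per element keeps the carry and borrow logic easy to follow. A faster version would pack several digits into each element, at the cost of slightly more involved conversion to and from strings.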

BigNumber