It is impossible to overestimate how much the architecture that surrounds us as we grow up shapes our emotional intelligence and our intellectual makeup in general. The reason lies in the fact that, unlike the beauty of nature, which makes us feel like an infinitely small and therefore rather unimportant element of creation, the beauty of art, and especially of architecture with its magnitude and impressiveness, boosts our self-esteem: it reminds us that it was created by our fellow human beings and makes us stop and think about what we ourselves may be capable of. In object-oriented terms, it lets us experience first-hand, on an emotional level, what “belongs to” is about (@@the_greatest_people_ever_lived << self, if you will).
As little as all of this has to do with programming, and especially with the CLI Data Gem, I couldn’t resist the temptation: instead of making my own life easier and following the sensible advice in the Readme concerning possible subjects, I used this opportunity to look for a website with the architectural landmarks of St. Petersburg, Russia, the city where I was born and lived for some time. It’s probably worth mentioning that the city was built at the beginning of the XVIII century in perhaps the unlikeliest place on Earth, the middle of the swampy, loamy delta of the Neva river, by one of the greatest visionaries in history, Peter the Great, who in many ways was centuries ahead of his time. To implement his vision, he combined the efforts of the best architects in Europe and created what many consider the most architecturally beautiful city in the world.
Finding a website with a clear enough structure for the purpose of this project took some effort, but it was worth it because the site I eventually found turned out to be surprisingly scraping-friendly. The landing page contains what is essentially the heading of the project as well as all 20 architectural attractions with photographs and brief descriptions. Scraping this page was quite straightforward in terms of CSS selectors: the project heading matched <h1> and the names of the landmarks matched <h2>. This allowed me to create an array of landmark instances and print out a numbered list of the landmarks when the application launches, right at the beginning of the CLI logic. For this purpose I created two classes under the Landmarks module, Landmark and Scraper, with strict adherence to the single responsibility principle: the Scraper class is only responsible for scraping the data, and the Landmark class is in turn responsible for producing its instances with attributes built from the scraped data.
Scraping-based OO applications may have a common denominator in the sense that they only need one instance of a Scraper class. I am not sure whether more than one instance of such a class would actually break the application, but it would be redundant and, to my understanding, contradict the DRY principle, so it should be avoided. Since the Scraper class must provide the data that the Landmark class assigns to its instances as properties, communication between these two classes is necessary. On the other hand, the CLI also needs something from the Scraper, first to launch the initialization of all the Landmark instances via scraping and second to produce the numbered list of the landmarks. To ensure the integrity of the CLI – Landmark – Scraper chain without any unnecessary interaction between the CLI and the Scraper, I assigned the only instance of the Scraper to a class variable of the Landmark class and exposed it as a property via a class reader method. Then in the CLI I assigned this Scraper property to a variable and subsequently called all the instance methods of the Scraper on this variable. I don’t know how conventional or rational this kind of design is, but this way the CLI doesn’t have to be aware of the Scraper class as such; it just deals with a property of the Landmark class assigned to a variable.
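The arrangement described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code of the actual application: the method names landmark_names and create_from_names are hypothetical, and the Scraper here returns placeholder data instead of scraping a page, so that the sketch runs on its own.

```ruby
module Landmarks
  # Stand-in for the real scraper: it returns canned names instead of
  # fetching and parsing a page, so the sketch needs no network access.
  class Scraper
    def landmark_names # hypothetical method name
      ["Sample Landmark A", "Sample Landmark B"] # placeholder data
    end
  end

  class Landmark
    @@scraper = Scraper.new # the only Scraper instance in the program
    @@all = []

    attr_reader :name

    # Class reader method exposing the single Scraper instance.
    def self.scraper
      @@scraper
    end

    def self.all
      @@all
    end

    # Hypothetical helper: builds one Landmark per scraped name.
    def self.create_from_names(names)
      @@all = names.map { |name| new(name) }
    end

    def initialize(name)
      @name = name
    end
  end
end

# CLI side: the Scraper is reached only through the Landmark class.
scraper = Landmarks::Landmark.scraper
Landmarks::Landmark.create_from_names(scraper.landmark_names)

numbered_list = Landmarks::Landmark.all.each.with_index(1).map { |landmark, index| "#{index}. #{landmark.name}" }
puts numbered_list
```

One consequence of this design is that the CLI file never mentions the Scraper constant, so the scraper could be swapped for a different implementation without touching the CLI.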
After welcoming the user and displaying the numbered list of the landmarks, the CLI logic launches the #start method and prompts the user to enter the number of the landmark they want to know more about. If the user enters an invalid number, i.e. anything other than 1 to 20, a simple if-conditional makes the application prompt the user again for a valid number instead of terminating. Once a valid landmark number is entered, the application displays the landmark’s detailed description (which in terms of scraping matched just a <p>) and offers the user the chance to check the availability of directions, contact information and business hours. Scraping this data also required just one CSS selector. The user is offered an opportunity to exit the application twice: after reading the description of the landmark and after checking its other information. At exactly those same points the user is asked whether or not they’d like to know more about another landmark. Upon exiting, the application thanks the user for their interest and invites them to come back in the future.
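The re-prompting behaviour can be sketched as a small loop. This is a minimal sketch under assumptions, not the application's actual #start method: the names valid_choice? and ask_for_landmark_number and the prompt wording are hypothetical, and the input and output streams are passed in as parameters only so that the loop can be tried without a terminal.

```ruby
require "stringio"

LANDMARK_COUNT = 20 # the landing page lists 20 landmarks

# True only for whole numbers between 1 and count.
def valid_choice?(input, count = LANDMARK_COUNT)
  text = input.to_s.strip
  text.match?(/\A\d+\z/) && text.to_i.between?(1, count)
end

# Keeps asking until a valid number arrives; returns nil if input ends.
def ask_for_landmark_number(input: $stdin, output: $stdout, count: LANDMARK_COUNT)
  loop do
    output.puts "Enter the number of a landmark (1-#{count}):"
    line = input.gets
    return nil if line.nil?                     # end of input, e.g. Ctrl-D
    return line.to_i if valid_choice?(line, count)

    output.puts "That is not a valid number."   # falls through and re-prompts
  end
end
```

Calling ask_for_landmark_number with no arguments reads from the keyboard. Passing StringIO objects instead makes it possible to feed the loop a scripted series of answers and inspect what it printed.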
There are a couple of things I encountered while writing this application that may be worth sharing. Firstly, when iterating through the array of instances of the Landmark class with the intention of printing out a numbered list using the each.with_index method, the application kept displaying the list in a completely wrong format, with extra new lines and with names above the index numbers instead of next to them. The reason turned out to be that not all the names in the instance array were in the same format: some of them contained "\r\n" before and after and some didn’t. I am not sure why, but .strip didn’t seem to solve the problem, and eventually I ended up calling an explicit substitution, .gsub("\r\n", ""), via an if-conditional on the names that contained the extra "\r\n".
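Why .strip appeared not to work is not known. Two common causes, offered here only as guesses, are that .strip returns a new string and leaves the original untouched unless the result is assigned, and that it does not remove non-breaking spaces (U+00A0), which scraped HTML often contains. The sketch below illustrates both with invented sample values; clean_name is a hypothetical helper.

```ruby
# Sample values are invented for illustration.
crlf_name = "\r\nSample Landmark\r\n"
nbsp_name = "\u00A0Sample Landmark\u00A0"

# strip does remove leading and trailing "\r\n", but only in the value
# it returns; crlf_name itself is left unchanged.
stripped = crlf_name.strip

# strip leaves non-breaking spaces in place, so nbsp_name.strip still
# equals nbsp_name.

# Hypothetical helper: trims any leading or trailing Unicode whitespace,
# including "\r\n" and non-breaking spaces.
def clean_name(text)
  text.gsub(/\A[[:space:]]+|[[:space:]]+\z/, "")
end

names = [crlf_name, nbsp_name, "Sample Landmark"].map { |name| clean_name(name) }
names.each.with_index(1) { |name, index| puts "#{index}. #{name}" }
```

Cleaning every name the same way at the moment it is scraped would also remove the need for the if-conditional, since the helper leaves already clean names as they are.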
Another, maybe less significant but still rather strange, thing was that when running the application the CLI seemed to ignore the "\n" escape sequence and didn’t print an empty line, while "\n\n" produced two empty lines, which wasn’t the intention. To print just one empty line, I ended up using "\t\n".
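One thing that may account for at least part of this, assuming the text was printed with puts (which is a guess): puts adds a newline only when the string does not already end with one, so a single trailing "\n" makes no visible difference. The sketch below captures exactly what puts writes; written_by_puts is a hypothetical helper that exists only for the demonstration.

```ruby
require "stringio"

# Hypothetical helper: returns exactly what puts would write for the
# given arguments, by sending the output to an in-memory buffer.
def written_by_puts(*args)
  buffer = StringIO.new
  buffer.puts(*args)
  buffer.string
end

plain        = written_by_puts("Hello")     # puts appends a newline
one_newline  = written_by_puts("Hello\n")   # no extra newline is appended
two_newlines = written_by_puts("Hello\n\n") # written as is: one blank line
bare         = written_by_puts              # a lone newline: one blank line
tab_newline  = written_by_puts("\t\n")      # a line holding only a tab

p plain, one_newline, two_newlines, bare, tab_newline
```

Under that assumption, a bare puts with no arguments is a simpler way to get exactly one empty line than "\t\n", which prints a line that merely looks empty because it contains a tab.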
All in all, I must say that the experience of making this first application more or less from scratch has given me an idea of the sense of satisfaction and fulfillment programmers must feel when their creations are up and running and serving a purpose in the real world. The experience was in fact so inspiring that I can’t wait to be in a position to write more complex and useful applications.