First Project: Denial, Anger, Bargaining, Depression, Acceptance

Posted by Maria Galdina on January 19, 2020

We all started our journey into the world of Data Science full of enthusiasm, curiosity and hope. It was exciting to start such a big program, one that can change your life. In December everything was sunny and clear for me. Our group went through all the tasks and lectures very smoothly. It was a wonderful time to meet Python, data sets and the mysterious environment of Git. We found out that NumPy isn’t an exotic fruit and that Pandas can work with data frames, not only eat bamboo. I met my study group and heard their stories of how they came to this Bootcamp. But time goes fast, and then it came: time for the first project.

**Denial - Nice day! Nice time! Nice Dataset!**

Every day we solve a million small and huge problems, answer questions and improve ourselves. But!!! The time for the first challenge is here. I read the project description for the first time and decided: it’s cool, pretty easy, and can be solved in the hours the school gave us, maybe even faster. We had worked a lot with Jupyter Notebook, every day we saw a few well-organized readme.md files, and I still have Keynote on my Mac. The movie-related data from the most popular sites looked better than anything I could get from those sites myself using BeautifulSoup. (I really tried many times, but the sites fought to the last for their information and won. I really need to improve my web scraping skills.) Denying the difficulties of this project, I made a few bad decisions about how to distribute my time and about my project solutions. They cost me a well-organized readme.md on GitHub and a good non-technical presentation, because I spent too much time on questions that I did not include in the final presentation.

**Anger - It can’t be true!**

First project, first problems. I realized how difficult it is for me to pick a nice, interesting topic to cover in the project. To look at the data more closely, I used the sqlite3 library and converted the files from .csv format into tables. When I inspected the tables, I realized that the data sets from the different sites all contain good information, but they cannot easily be connected together. The tables have different key fields and empty values, and their columns have to be converted to specific data types before you can work with them. Data cleaning made my brain boil, but at this stage of the project questions started to loom. I saw that I could calculate averages for some columns and create simple plots. But my questions to myself still haunted me, and I could not figure out the main task for my research. Real anger came closer when I found that “Star Wars: Episode 4 - A New Hope” is the same movie as “Star Wars: A New Hope (Episode 4)”, with maybe a few more variations. That means I lose a piece of information about budgets if I try to look closely at each movie. Only fast data conversion gave me the power to go forward. Budgets, dates, runtimes and genres didn’t give me a reason to look for another set of information.
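A minimal sketch of how the two title spellings could be matched, assuming the variants differ only in punctuation and word order. The function name `title_key` is made up for illustration and is not what the project actually used.

```python
import re


def title_key(title: str) -> str:
    """Build an order-insensitive matching key for a movie title.

    Hypothetical helper: assumes title variants differ only in
    punctuation, letter case and word order.
    """
    # Lowercase the title and keep only runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    # Sorting the tokens makes the key independent of word order.
    return " ".join(sorted(tokens))


first = title_key("Star Wars: Episode 4 - A New Hope")
second = title_key("Star Wars: A New Hope (Episode 4)")
print(first == second)
```

A key like this could be stored as an extra column in each table and used for the join instead of the raw title. It is a blunt tool: two different movies whose titles contain the same words would get the same key, so the matches are worth checking by eye.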

**Bargaining - Okay Google!**

It’s the time when I go to bargain with my brain and decide to read articles about movie companies and how data scientists work with them. Maybe while I am trying to visualize how many films in each genre different studios produce, those studios are trying to analyze completely different information. I discovered that data science has effectively changed the methods used to turn data analytics into useful insight and profit. Studio analytics are often externally sourced, using information drawn from the internet, public data sources, and more to make more accurate predictions. For me, that meant I continued working with my lovely data columns that cover economic information. I decided to take the simple way: absorb my new knowledge and understand it well enough to use in new projects, without complex connections between my datasets.

**Depression - Deadline coming!**

The Day is getting closer and closer, but I still want to do more and more. There is so much I should add to all my files: comments, new functions, new plots.

**Acceptance - Just be yourself.**

I did my best this time. I know my data analysis does not cover a lot of questions, and I cannot be proud of perfectly clean data. But now I see what to strive for and what to do to improve my knowledge and skills, and how important it is to write comments for all parts of your code and to read logs.

Overall, it was a very interesting experience that the school’s teachers guided us through. I made a lot of notes about technology and about allocating my time and resources.